Despite recent advances in the design of features for human action recognition, color information has usually been ignored or used ineffectively. In this paper, we propose a new type of descriptor for human action recognition, named the color similarity descriptor. The descriptor builds on existing color descriptors and represents a video clip using the color similarity between relevant video patches. It takes advantage of color information while being more efficient and robust than the original color descriptors. We evaluate the proposed descriptor on three challenging public datasets: YouTube, UCF Sports, and UCF50. Its performance is competitive with state-of-the-art methods, and on the UCF50 dataset our result outperforms the best reported result by up to 3.9%.