Human action recognition in videos using distance image volumes and sparse coding
Human action recognition in video remains a difficult challenge because the same action varies widely across actors. In this paper, we propose the distance image volume as a descriptor of actions in videos and combine it with sparse coding for human action recognition. Unlike approaches that rely on both appearance and motion features, or on spatio-temporal interest points, to improve recognition performance, our approach uses only distance image volumes, which capture both the appearance and the motion information of a human action at low computational cost. For recognition, we use sparse coding (SC), in which the dictionary is formed from the distance image volumes of the training action videos. Each test video is then represented as a linear combination of the basis vectors of the dictionary. The residuals between the test video and its reconstruction from each action class are computed and used to determine its action class. Experiments are conducted on the publicly available Weizmann data set. The results demonstrate that the distance image volume feature combined with sparse coding improves recognition performance.
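The residual-based classification step described above can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: each video is taken to be already reduced to a flattened feature vector (standing in for a distance image volume), plain matching pursuit is swapped in as a simple greedy sparse solver because the abstract does not name one, the label is assigned to the class with the smallest residual, and the names `matching_pursuit`, `classify`, and the toy data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v] if n > 0 else list(v)


def matching_pursuit(atoms, y, n_iter=10, tol=1e-9):
    """Greedy sparse code of y over unit-norm dictionary atoms.

    Assumption: plain matching pursuit stands in for whatever
    sparse solver the paper actually uses.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        corr = [dot(a, residual) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) < tol:
            break
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs


def classify(atoms, labels, y, n_iter=10):
    """Return (label, residuals): the class whose atoms best reconstruct y."""
    coeffs = matching_pursuit(atoms, y, n_iter)
    residuals = {}
    for c in set(labels):
        # Reconstruct y using only the coefficients of class c.
        recon = [0.0] * len(y)
        for a, lab, x in zip(atoms, labels, coeffs):
            if lab == c:
                recon = [r + x * ai for r, ai in zip(recon, a)]
        residuals[c] = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals


# Toy dictionary: two hypothetical training vectors per action class.
atoms = [normalize(v) for v in ([1, 0, 0], [1, 0.1, 0], [0, 1, 0], [0, 1, 0.1])]
labels = ["walk", "walk", "run", "run"]
test_vector = [0.9, 0.05, 0.0]  # hypothetical test feature vector

predicted, class_residuals = classify(atoms, labels, test_vector)
print(predicted, class_residuals)
```

In this toy setup the dictionary columns are the training vectors themselves, grouped by class label, which mirrors the abstract's description of a dictionary built directly from training videos rather than a learned one.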