Research on Human Action Recognition Method Based on Machine Learning

Yushan Li

doi:10.54097/eezedv44

Keywords:

Human action; recognition method; machine learning; research progress.

Abstract

Human action recognition (HAR) is an important research direction in computer vision, providing a foundation for computers to understand and simulate human behavior. It is widely used in practical applications, including human-computer interaction, gesture recognition, motion analysis, motion capture, virtual reality, and augmented reality. Early methods combined hand-crafted features with traditional machine learning classifiers such as the support vector machine (SVM) and random forest. These methods rely on manually extracted features, such as edge, texture, and color information, and they perform poorly in complex scenes and changing environments. With the rise of deep learning, and especially the successful application of convolutional neural networks (CNNs), human action recognition has made major breakthroughs. A CNN learns feature representations directly from raw image data, which removes the need to design features by hand. This paper reviews the research progress and future development trends of human action recognition methods based on machine learning.
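
The contrast between manually designed features and learned features can be illustrated with a small sketch. It is not taken from the paper: the function name `conv2d_valid`, the constant `SOBEL_X`, and the toy `frame` are assumptions chosen for illustration. The Sobel kernel stands in for a hand-designed edge feature; a convolutional layer in a CNN applies the same sliding-window operation, but its kernel weights are learned from training data rather than fixed in advance.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` with stride 1 and no padding.

    As in most deep learning libraries, this is cross-correlation:
    the kernel is applied without being flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hand-designed feature (illustrative): the Sobel kernel, which responds
# to left-to-right changes in intensity. A CNN would learn such weights.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Toy 4x5 grayscale frame: dark on the left, bright on the right.
frame = [[0, 0, 1, 1, 1] for _ in range(4)]

edges = conv2d_valid(frame, SOBEL_X)
# The response is nonzero only where the window covers the dark-to-bright edge.
print(edges)
```

In a traditional pipeline, responses like `edges` would be pooled into a feature vector and passed to a classifier such as an SVM; in a CNN, the kernel values themselves are trained jointly with the classifier.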
