Research on Improved YOLOv4-based Pedestrian Detection Model

Authors

  • Ruiqi Hu
  • Yujun Li

DOI:

https://doi.org/10.54097/hset.v7i.1090

Keywords:

Pedestrian detection, YOLOv4, RepVGG Block, SENet attention, SIoU loss function

Abstract

Pedestrian safety is critical in many fields, such as autonomous driving. To improve the precision of pedestrian detection and to distinguish pedestrians from person-like objects, an improved YOLOv4 pedestrian detection method is proposed. First, the RepVGG Block is introduced into the feature extraction and feature fusion layers of YOLOv4 to strengthen the network's feature extraction ability and reduce the loss of feature information. Second, the SENet attention mechanism is introduced so that the algorithm focuses more on useful information. Finally, the SIoU loss function is applied to the regression of the pedestrian bounding box, which speeds up convergence and reduces the directionless drift of the predicted box. Experimental results show that, compared with the original YOLOv4 algorithm, the improved algorithm achieves higher detection precision and can also distinguish pedestrians from person-like objects on the public Pedestrian Detection dataset, reaching a detection precision of 83.7%.
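The RepVGG Block used here relies on structural re-parameterization: at training time each block runs parallel 3x3, 1x1, and identity branches, which are folded into a single 3x3 convolution for inference. The following is a minimal single-channel NumPy sketch (not the paper's implementation; batch normalization, which the real block also folds in, is omitted) showing why the merge is exact:

```python
import numpy as np

def conv2d(x, k):
    # 'same' cross-correlation with zero padding, single channel
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k3 = rng.standard_normal((3, 3))   # 3x3 branch
k1 = rng.standard_normal((1, 1))   # 1x1 branch

# multi-branch (training-time) form: 3x3 conv + 1x1 conv + identity
y_branches = conv2d(x, k3) + conv2d(x, k1) + x

# re-parameterized (inference-time) form: fold everything into one 3x3 kernel;
# the 1x1 kernel and the identity both contribute only to the centre tap
k_merged = k3.copy()
k_merged[1, 1] += k1[0, 0] + 1.0
y_merged = conv2d(x, k_merged)

assert np.allclose(y_branches, y_merged)
```

Because the merge is an identity on the linear (pre-activation) part of the block, the inference network keeps the accuracy of the multi-branch training network while running as a plain stack of 3x3 convolutions.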


References

Xin, Y., Gao, H., Zhao, J., & Zhou, M. (2018). Overview of deep learning intelligent driving methods. Journal of Tsinghua University (Natural Science Edition), 58(04), 438-444.

Geng, Y., Liu, S., Liu, T., Yan, W., & Lian, Y. (2021). Survey of pedestrian detection technology based on computer vision. Journal of Computer Applications, 41(S1), 43-50.

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 580-587).

Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE international conference on computer vision (pp. 1440-1448).

Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28.

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788).

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016, October). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21-37). Springer, Cham.

Redmon, J., & Farhadi, A. (2017). YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7263-7271).

Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.

Bochkovskiy, A., Wang, C. Y., & Liao, H. Y. M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.

Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980-2988).

Zhang, M., Li, J., & Chen, H. (2022). Real-time video pedestrian detection algorithm based on YOLOv3. Microcontrollers & Embedded Systems, 22(06), 29-32.

Hao, X., & Chai, Z. (2019). Improved pedestrian detection method based on depth residual network. Application Research of Computers, 36(05), 1569-1572+1584.

Shi, R., Chen, H., Li, J., Li, Y., Li, F., & Wan, C. (2022). Small-scale pedestrian detection algorithm in railway scene based on attention and multi-level feature fusion. Journal of the China Railway Society, 44(05), 76-83.

Li, X., Fu, H., Niu, W., Wang, P., Lv, Z., & Wang, W. (2022). Multimodal pedestrian detection algorithm based on deep learning. Journal of Xi'an Jiaotong University, (10), 76-83.

Kang, S., Zhang, J., Zhu, Z., & Tong, G. (2021). Improved YOLOv4 algorithm for pedestrian detection in complex visual scene. Telecommunications Science, 37(08), 46-56.

Wang, C. Y., Liao, H. Y. M., Wu, Y. H., Chen, P. Y., Hsieh, J. W., & Yeh, I. H. (2020). CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 390-391).

He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE transactions on pattern analysis and machine intelligence, 37(9), 1904-1916.

Liu, S., Qi, L., Qin, H., Shi, J., & Jia, J. (2018). Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8759-8768).

Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., & Sun, J. (2021). RepVGG: Making VGG-style ConvNets great again. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 13733-13742).

Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7132-7141).

Gevorgyan, Z. (2022). SIoU Loss: More Powerful Learning for Bounding Box Regression. arXiv preprint arXiv:2205.12740.

Karthika, N. J., & Chandran, S. (2020). Addressing the False Positives in Pedestrian Detection. In Electronic Systems and Intelligent Computing (pp. 1083-1092). Springer, Singapore.

Published

03-08-2022

How to Cite

Hu, R., & Li, Y. (2022). Research on Improved YOLOv4-based Pedestrian Detection Model. Highlights in Science, Engineering and Technology, 7, 313-322. https://doi.org/10.54097/hset.v7i.1090