Research Overview of YOLO Series Object Detection Algorithms Based on Deep Learning
DOI:
https://doi.org/10.54097/p81rtv77
Keywords:
YOLO, Target detection, Algorithmic history of YOLO, Deep learning, Convolutional neural networks
Abstract
Amid the rapid development of deep learning, YOLO, the first popular single-stage object detection model, reshaped the computer vision community with its architecture and design concepts and marked a significant step forward in object detection technology. Today it is regarded as a milestone in the field and as a benchmark for balancing detection speed against accuracy. YOLO has been widely applied in fields such as agriculture, industry, and pedestrian detection. This paper first introduces traditional object detection methods, then analyzes deep learning-based object detection, and then explains the fundamental concepts of YOLO. It systematically reviews the YOLO family and its significant improvements. Finally, it classifies and summarizes YOLO algorithms according to their improvement strategies or application scenarios.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.