Research Advanced in Object Detection based on Deep Learning
DOI: https://doi.org/10.54097/tax5ym24
Keywords: computer vision, natural language processing, object detection, anchor-free.
Abstract
Object detection is a fundamental research topic in the computer vision community; it focuses on predicting the category and location of every object in a scene. Over the last several years, driven by the rapid development of deep learning, general object detection methods have achieved significant breakthroughs in both speed and accuracy. This paper reports the latest research progress in deep-learning-based object detection to inspire and promote subsequent research. Specifically, it systematically reviews prior work in four categories: two-stage, single-stage, Transformer-based, and keypoint-based methods, covering the design ideas and basic pipelines of representative algorithms. In addition, it quantitatively compares the performance of different methods on common datasets to clarify the advantages and disadvantages of each category. Finally, it summarizes the challenges that remain in object detection and discusses future development directions.
License
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







