A Lightweight Object Detection Network for UAV Aerial Images
DOI: https://doi.org/10.54097/c5q8fv57
Keywords: UAV, YOLOv7-tiny, Loss Function, Lightweight Model
Abstract
Aerial image target detection suffers from weak detection performance, high network model complexity, and difficult deployment. To address these problems, this paper designs a lightweight target detection network for UAV aerial images based on the YOLOv7-tiny algorithm. Partial convolution (PConv) is introduced into the network and the feature extraction block ELAN is improved, which reduces the computational cost of convolution and the number of model parameters during feature extraction and thus makes the model lightweight. The feature fusion part of the network is optimized to improve the network's ability to extract features of small targets. In addition, the large-target detection layer of the original network is replaced with a small-target detection layer suited to aerial images, and an attention mechanism is embedded in the backbone network, which addresses the weak detection performance on aerial images. The loss function of the network is improved so that the predicted boxes generated by the detection network match the ground-truth boxes more closely during regression, which improves the training process of the network. Experimental results on the publicly available VisDrone2019 dataset show that, compared with the YOLOv7-tiny algorithm, the proposed model improves detection accuracy by 0.7%, recall (R) by 2.2%, the F1 value by 1.6%, and mean average precision (mAP) by 2.3%, while reducing the number of parameters by 52.1%. Moreover, the detection speed reaches 66 frames per second (FPS), which meets the real-time requirements of aerial image detection and provides a research idea for the field of UAV aerial image detection.
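To make the lightweighting claim concrete, the sketch below compares the parameter count and multiply-accumulate cost of a regular convolution with a partial convolution that convolves only a fraction of the channels and passes the rest through unchanged. It is an illustrative calculation, not the paper's implementation: the feature map size, channel width, and the 1/4 partial ratio are assumptions chosen for the example, and the function names are hypothetical.

```python
def conv_cost(h, w, k, c_in, c_out):
    """Return (params, macs) for a bias-free k x k convolution
    producing an h x w output feature map."""
    params = k * k * c_in * c_out
    macs = h * w * params  # one multiply-accumulate per weight per output pixel
    return params, macs


def pconv_cost(h, w, k, c, ratio=4):
    """Cost of a partial convolution: only c // ratio channels are convolved;
    the remaining channels are copied through and cost nothing here.
    ratio=4 is an assumed setting, not a value taken from the abstract."""
    c_p = c // ratio
    return conv_cost(h, w, k, c_p, c_p)


# Assumed example layer: 40 x 40 feature map, 3 x 3 kernel, 64 channels.
h = w = 40
k = 3
c = 64

full_params, full_macs = conv_cost(h, w, k, c, c)
part_params, part_macs = pconv_cost(h, w, k, c)

print("regular conv:", full_params, "params,", full_macs, "MACs")
print("partial conv:", part_params, "params,", part_macs, "MACs")
print("reduction factor:", full_macs // part_macs)
```

With a 1/4 partial ratio, both the convolved input and output channel counts shrink by 4, so the cost of this single layer drops by a factor of 16. The 52.1% parameter reduction reported above is for the whole network, which also contains layers that PConv does not replace.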
Copyright (c) 2024 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution 4.0 International License.