Study on the Lightweighting of YOLOv5s Model for Precise Detection of Irregular-shaped Components
DOI:
https://doi.org/10.54097/Yr8tCP5b
Keywords:
Target Identification, YOLOv5, Lightweighting, Irregular-shaped Parts, DCNv2
Abstract
Traditional target recognition methods struggle to meet both the precision and speed requirements of precision-shaped part recognition. In this paper, we propose an improved deep-learning-based recognition algorithm for precision-shaped parts, built by modifying the YOLOv5 network. Specifically, we replace the C3 module in the backbone of the original network with a C3_ghostnetv2 module, which incorporates GhostNetV2. This modification yields a lighter network with fewer parameters and a smaller model size, thereby improving detection speed. In addition, we replace the convolution in the neck of the original network with Deformable Convolution v2 (DCNv2) to enhance feature extraction for precision-shaped parts. We conduct comparative experiments on a self-built dataset of precision-shaped parts. The experimental results show that, compared with the original YOLOv5s algorithm, the improved algorithm reduces the parameter count by 13.7% and the model size by 12.5% while increasing detection accuracy by 1.4%. The proposed algorithm accurately identifies and classifies precision-shaped machined parts, providing technical support for subsequent intelligent production.
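To illustrate why a ghost-style module reduces the parameter count, the following minimal sketch compares the weights of a standard convolution with those of a ghost module of the kind introduced in GhostNet and reused in GhostNetV2. It is not the C3_ghostnetv2 module itself: the function names, the channel widths, and the ratio of 2 are assumptions chosen for illustration, and the DFC attention branch of GhostNetV2 is omitted.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Number of weights in a bias-free k x k convolution."""
    return (c_in // groups) * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int = 1, d: int = 3, ratio: int = 2) -> int:
    """Number of weights in a ghost module (illustrative, no attention branch).

    A primary k x k convolution produces c_out // ratio "intrinsic" channels;
    cheap depthwise d x d convolutions generate the remaining channels from them.
    """
    primary = c_out // ratio          # channels from the ordinary convolution
    cheap = c_out - primary           # channels from the cheap depthwise operations
    return (conv_params(c_in, primary, k)
            + conv_params(primary, cheap, d, groups=primary))


if __name__ == "__main__":
    # Hypothetical layer: 128 -> 128 channels with a 3 x 3 kernel.
    standard = conv_params(128, 128, 3)
    ghost = ghost_params(128, 128, k=3, d=3, ratio=2)
    print(f"standard conv: {standard} weights")
    print(f"ghost module:  {ghost} weights")
    print(f"reduction:     {100 * (1 - ghost / standard):.1f}%")
```

The saving for this single hypothetical layer is close to one half, which is much larger than the 13.7% reported for the whole model. That gap is expected under the stated design, since only the backbone C3 modules are replaced and DCNv2 layers carry additional offset and modulation weights of their own.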
License
Copyright (c) 2024 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution 4.0 International License.

