Improved YOLOV7-TINY Network for Sea Bream Detection

Authors

  • Linhua Jiang
  • Yuanyuan Yang
  • Entuo Liu
  • Lingxi Hu
  • Jiahao Xu
  • Lei Chen
  • Jintao Zhang
  • Peng Liu
  • Wei Long

DOI:

https://doi.org/10.54097/c6kqnn36

Keywords:

Cascaded feature fusion, Attention mechanism, Improved YOLOV7-TINY network, ECIOU

Abstract

Accurate identification of underwater fish species is of great scientific and economic significance in aquaculture, as it provides a scientific basis for production and supports related research. However, the underwater environment is complex, affected by factors such as lighting, water quality, and mutual occlusion among fish. As a result, underwater fish images are often unclear, which limits accurate identification of underwater targets. In this paper, an improved YOLOV7-TINY model for sea bream detection is proposed. We employ FasterNet to replace the backbone network of the YOLOV7-TINY model, further reducing model parameters and computational complexity without compromising accuracy. By leveraging cascaded feature fusion in the backbone network, we effectively address the challenges posed by multi-scale datasets and insufficient information extraction. Additionally, the RESNETCBAM attention mechanism is incorporated into the feature maps at three different scales, allowing the network to better capture relevant information from complex underwater environments while minimizing unnecessary interference. Finally, the ECIOU loss function is adopted to optimize bounding-box regression and reduce the training time of the model.
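The ECIOU loss mentioned above belongs to the CIoU family of bounding-box regression losses, which penalize overlap, center distance, and aspect-ratio mismatch jointly. The exact ECIOU formulation is defined in the paper; as an illustrative sketch only, the CIoU base it builds on can be written in plain Python (box format `(x1, y1, x2, y2)` is an assumption here):

```python
import math

def ciou_loss(pred, target):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    L = 1 - IoU + rho^2 / c^2 + alpha * v, where rho is the center
    distance, c the diagonal of the smallest enclosing box, and v an
    aspect-ratio consistency term. Illustrative sketch, not the
    paper's exact ECIOU formulation.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection and union areas
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between box centers
    rho2 = (((px1 + px2) - (tx1 + tx2)) ** 2
            + ((py1 + py2) - (ty1 + ty2)) ** 2) / 4.0

    # Squared diagonal of the smallest enclosing box
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight
    v = (4.0 / math.pi ** 2) * (
        math.atan((tx2 - tx1) / (ty2 - ty1))
        - math.atan((px2 - px1) / (py2 - py1))) ** 2
    denom = (1.0 - iou) + v
    alpha = v / denom if denom > 0 else 0.0

    return 1.0 - iou + rho2 / c2 + alpha * v
```

For identical boxes every penalty term vanishes and the loss is 0; for disjoint boxes the loss exceeds 1, with the extra penalty growing as the centers move apart, which is what speeds up convergence relative to a plain IoU loss.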

References

[1] Xu Hai, Xie Hongtao, and Zhang Yongdong, "Advances in Visual Domain Generalization Techniques and Research," Journal of Guangzhou University (Natural Science Edition), vol. 21, no. 2, pp. 42–59, 2022.

[2] N. J. C. Strachan, P. Nesvadba, and A. R. Allen, "Fish species recognition by shape analysis of images," Pattern Recognition, vol. 23, no. 5, pp. 539–544, January 1990, doi: 10.1016/0031-3203(90)90074-U.

[3] N. Castignolles, M. Cattoen, and M. Larinier, "Identification and counting of live fish by image analysis," presented at IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology, S. A. Rajala and R. L. Stevenson, eds., San Jose, CA, March 1994, pp. 200–209. doi: 10.1117/12.171067.

[4] D.-J. Lee, R. B. Schoenberger, D. Shiozawa, X. Xu, and P. Zhan, "Contour matching for a fish recognition and migration-monitoring system," presented at Optics East, K. G. Harding, ed., Philadelphia, PA, December 2004, p. 37. doi: 10.1117/12.571789.

[5] Ding Shunrong and Xiao Ke, "Research on fish classification method based on particle swarm optimization SVM and multi-feature fusion," Chinese Journal of Agricultural Mechanization, vol. 41, no. 11, pp. 113–118, 170, 2020, doi: 10.13733/j.jcam.issn.2095-5553.2020.11.018.

[6] Yao Runlu, Gui Yongwen, and Huang Qiugui, "Freshwater fish species recognition based on machine vision," Journal of Microcomputers and Applications, vol. 36, no. 24, pp. 37–39, 2017, doi: 10.19358/j.issn.1674-7720.2017.24.011.

[7] P. Cisar, D. Bekkozhayeva, O. Movchan, M. Saberioon, and R. Schraml, "Computer vision based individual fish identification using skin dot pattern," Sci Rep, vol. 11, no. 1, p. 16904, August 2021, doi: 10.1038/s41598-021-96476-4.

[8] R. B. Dala-Corte, J. B. Moschetta, and F. G. Becker, "Photo-identification as a technique for recognition of individual fish: a test with the freshwater armored catfish Rineloricaria aequalicuspis Reis & Cardoso, 2001 (Siluriformes: Loricariidae)," Neotrop. ichthyol., vol. 14, no. 1, 2016, doi: 10.1590/1982-0224-20150074.

[9] Chen Feifen, "Research and application of water meter reading recognition based on deep learning," Master's thesis, Guilin University of Electronic Technology, 2022. doi: 10.27049/d.cnki.ggldc.2021.000020.

[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017, doi: 10.1145/3065386.

[11] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv, April 10, 2015. Accessed: March 14, 2024. [Online]. Available: http://arxiv.org/abs/1409.1556

[12] W. Zaremba, I. Sutskever, and O. Vinyals, "Recurrent Neural Network Regularization," arXiv, February 19, 2015. Accessed: March 14, 2024. [Online]. Available: http://arxiv.org/abs/1409.2329

[13] K. Greff, R. K. Srivastava, J. Koutnik, B. R. Steunebrink, and J. Schmidhuber, "LSTM: A Search Space Odyssey," IEEE Trans. Neural Netw. Learning Syst., vol. 28, no. 10, pp. 2222–2232, October 2017, doi: 10.1109/TNNLS.2016.2582924.

[14] I. Goodfellow et al., "Generative Adversarial Nets," in Advances in Neural Information Processing Systems, vol. 27, 2014.

[15] C. Szegedy et al., "Going deeper with convolutions," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA: IEEE, June 2015, pp. 1–9. doi: 10.1109/CVPR.2015.7298594.

[16] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA: IEEE, June 2016, pp. 770–778. doi: 10.1109/CVPR.2016.90.

[17] A. G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv, April 16, 2017. Accessed: March 14, 2024. [Online]. Available: http://arxiv.org/abs/1704.04861

[18] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT: IEEE, June 2018, pp. 4510–4520. doi: 10.1109/CVPR.2018.00474.

[19] A. Howard et al., "Searching for MobileNetV3," in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South): IEEE, October 2019, pp. 1314–1324. doi: 10.1109/ICCV.2019.00140.

[20] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA: IEEE, July 2017.

[21] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 580–587.

[22] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, June 2017, doi: 10.1109/TPAMI.2016.2577031.

[23] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," presented at Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2961–2969. Accessed: March 14, 2024. [Online]. Available: https://openaccess.thecvf.com/content_iccv_2017/html/He_Mask_R-CNN_ICCV_2017_paper.html

[24] W. Liu et al., "SSD: Single Shot MultiBox Detector," in Computer Vision – ECCV 2016, vol. 9905, B. Leibe, J. Matas, N. Sebe, and M. Welling, eds., Lecture Notes in Computer Science, vol. 9905., Cham: Springer International Publishing, 2016, pp. 21–37. doi: 10.1007/978-3-319-46448-0_2.

[25] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, April 8, 2018. doi: 10.48550/arXiv.1804.02767.

[26] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv, April 22, 2020. doi: 10.48550/arXiv.2004.10934.

[27] X. Zhu, S. Lyu, X. Wang, and Q. Zhao, "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios," in 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada: IEEE, October 2021, pp. 2778–2788. doi: 10.1109/ICCVW54120.2021.00312.

[28] C. Li et al., "YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications," arXiv, September 7, 2022. doi: 10.48550/arXiv.2209.02976.

[29] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada: IEEE, June 2023, pp. 7464–7475. doi: 10.1109/CVPR52729.2023.00721.

[30] S. Zhao, J. Zheng, S. Sun, and L. Zhang, "An Improved YOLO Algorithm for Fast and Accurate Underwater Object Detection," Symmetry, vol. 14, no. 8, Art. no. 8, August 2022, doi: 10.3390/sym14081669.

[31] Y. Li, X. Bai, and C. Xia, "An Improved YOLOV5 Based on Triplet Attention and Prediction Head Optimization for Marine Organism Detection on Underwater Mobile Platforms," JMSE, vol. 10, no. 9, p. 1230, September 2022, doi: 10.3390/jmse10091230.

[32] X. Zhai, H. Wei, Y. He, Y. Shang, and C. Liu, "Underwater Sea Cucumber Identification Based on Improved YOLOv5," Applied Sciences, vol. 12, no. 18, p. 9105, September 2022, doi: 10.3390/app12189105.

[33] A. Markus, G. Kecskemeti, and A. Kertesz, "Flexible Representation of IoT Sensors for Cloud Simulators," in 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), March 2017, pp. 199–203. doi: 10.1109/PDP.2017.87.

[34] J. Chen et al., "Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks," arXiv. Accessed: November 8, 2023. [Online]. Available: https://arxiv.org/abs/2303.03667v3

[35] X. Zhang, X. Zhou, M. Lin, and J. Sun, "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices," presented at Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6848–6856. Accessed: March 15, 2024. [Online]. Available: https://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_ShuffleNet_An_Extremely_CVPR_2018_paper.html

[36] K. Han, Y. Wang, Q. Tian, J. Guo, C. Xu, and C. Xu, "GhostNet: More Features From Cheap Operations," presented at Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 1580–1589. Accessed: March 15, 2024. [Online]. Available: https://openaccess.thecvf.com/content_CVPR_2020/html/Han_GhostNet_More_Features_From_Cheap_Operations_CVPR_2020_paper.html

[37] X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, June 2011, pp. 315–323. Accessed: March 15, 2024. [Online]. Available: https://proceedings.mlr.press/v15/glorot11a.html

[38] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional Block Attention Module," presented at Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19. Accessed: March 15, 2024. [Online]. Available: https://openaccess.thecvf.com/content_ECCV_2018/html/Sanghyun_Woo_Convolutional_Block_Attention_ECCV_2018_paper.html

[39] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, October 1986, doi: 10.1038/323533a0.

[40] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, "Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression," presented at Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 658–666. Accessed: March 15, 2024. [Online]. Available: https://openaccess.thecvf.com/content_CVPR_2019/html/Rezatofighi_Generalized_Intersection_Over_Union_A_Metric_and_a_Loss_for_CVPR_2019_paper.html

[41] Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, Art. no. 07, April 2020, doi: 10.1609/aaai.v34i07.6999.

[42] Z. Zheng et al., "Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation," IEEE Transactions on Cybernetics, vol. 52, no. 8, pp. 8574–8586, August 2022, doi: 10.1109/TCYB.2021.3095305.

Published

05-12-2024

Section

Articles

How to Cite

Jiang, L., Yang, Y., Liu, E., Hu, L., Xu, J., Chen, L., Zhang, J., Liu, P., & Long, W. (2024). Improved YOLOV7-TINY Network for Sea Bream Detection. Journal of Computing and Electronic Information Management, 15(2), 55-65. https://doi.org/10.54097/c6kqnn36