A Dual-Task Cascade Network for Underwater Object Detection with Scene Adaptability

Authors

  • Haiyang Yao
  • Jinhao Shi
  • Yuzhang Zang
  • Xiaobo Zhao
  • Xiao Chen
  • Tao Lei
  • Haiyan Wang

DOI:

https://doi.org/10.54097/2amams14

Keywords:

Underwater Object Detection, Adaptive Detection Framework, Underwater Optical Image Processing

Abstract

Underwater small-object detection is hindered by complex backgrounds and by the difficulty of capturing fine-scale features. This paper addresses these challenges with three contributions: (1) a perturbation-theoretic model that rigorously quantifies background interference by decomposing the loss function into an ideal component and a perturbation component; (2) the Dual-Task Cascade Network (DTC-Net), which first classifies the scene (uniform, non-uniform, or enhanced) and then activates the matching scene-specific detection branch; and (3) a Dual Attention Feature Pyramid Network (DAFPN) and a dynamic-convolution Focus Module, designed to recalibrate features and emphasize fine-grained details. Extensive experiments on UTDAC2020 show that DTC-Net achieves a state-of-the-art 48.1% AP, outperforming FCOS and D-FINE. The scene-adaptive strategy also transfers to existing detectors (e.g., GCC-Net), improving them by up to 1.9% AP.
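
The cascade described in the abstract (classify the scene first, then run only the matching detection branch) can be illustrated with a minimal routing sketch. This is not the authors' implementation: the class name CascadeDetector, the toy_scorer rule, and the placeholder branches are all hypothetical, and only the three scene labels come from the abstract.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Scene labels taken from the abstract; everything else here is hypothetical.
SCENES = ("uniform", "non_uniform", "enhanced")

SceneScorer = Callable[[Sequence[float]], Dict[str, float]]
Detector = Callable[[Sequence[float]], List[str]]


class CascadeDetector:
    """Two-stage routing: score the scene, then call one detection branch."""

    def __init__(self, scorer: SceneScorer, branches: Dict[str, Detector]):
        missing = set(SCENES) - set(branches)
        if missing:
            raise ValueError(f"missing branches: {sorted(missing)}")
        self.scorer = scorer
        self.branches = branches

    def __call__(self, image: Sequence[float]) -> Tuple[str, List[str]]:
        scores = self.scorer(image)
        # Task 1: pick the highest-scoring scene label.
        scene = max(SCENES, key=lambda s: scores[s])
        # Task 2: run only the branch registered for that scene.
        return scene, self.branches[scene](image)


def toy_scorer(image: Sequence[float]) -> Dict[str, float]:
    # Placeholder rule (an assumption, not from the paper): a small
    # intensity spread is treated as a "uniform" scene.
    spread = max(image) - min(image)
    return {"uniform": 1.0 - spread, "non_uniform": spread, "enhanced": 0.0}


# Placeholder branches that just report which branch ran.
branches = {
    name: (lambda img, name=name: [f"{name}_branch_output"]) for name in SCENES
}

cascade = CascadeDetector(toy_scorer, branches)
scene, dets = cascade([0.2, 0.9, 0.4])
print(scene, dets)
```

In a real system the scorer would be a learned scene classifier and each branch a full detector; the sketch only shows the control flow that lets a single scene decision select one branch per image.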

References

[1] Igbinenikaro O P, Adekoya O O, Etukudoh E A. Emerging underwater survey technologies: a review and future outlook [J]. Open Access Research Journal of Science and Technology, 2024, 10(02): 071-084.

[2] Liu Z, Liu K, Chen X, et al. Deep-sea rock mechanics and mining technology: State of the art and perspectives[J]. International Journal of Mining Science and Technology, 2023, 33 (9): 1083-1115.

[3] Zhang B, Ji D, Liu S, et al. Autonomous underwater vehicle navigation: a review[J]. Ocean Engineering, 2023, 273: 113861.

[4] Chu S, Lin M, Li D, et al. Adaptive reward shaping based reinforcement learning for docking control of autonomous underwater vehicles[J]. Ocean Engineering, 2025, 318: 120139.

[5] Xu S, Zhang M, Song W, et al. A systematic review and analysis of deep learning-based underwater object detection[J]. Neurocomputing, 2023, 527: 204-232.

[6] Feng C, Zhong Y, Gao Y, et al. TOOD: Task-aligned one-stage object detection[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, 2021: 3490-3499.

[7] Zhao Y, Lv W, Xu S, et al. DETRs beat YOLOs on real-time object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 16965-16974.

[8] Sun P, Zhang R, Jiang Y, et al. Sparse R-CNN: End-to-end object detection with learnable proposals[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 14454-14463.

[9] Zhang H, Chang H, Ma B, et al. Dynamic R-CNN: Towards high quality object detection via dynamic training[C]// Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16. Springer International Publishing, 2020: 260-275.

[10] Chen X, Yuan M, Yang Q, et al. Underwater-YCC: underwater object detection optimization algorithm based on YOLOv7[J]. Journal of Marine Science and Engineering, 2023, 11(5): 995.

[11] Chen, X., Fan, C., Shi, J., Wang, H., Yao, H., 2024. Underwater object detection and embedded deployment based on lightweight YOLO_GN. The Journal of Supercomputing 80, 14057–14084.

[12] X. Chen, X. Chen, F. Wu, H. Wang, H. Yao, Online_xkd: An online knowledge distillation model for underwater object detection, Computers and Electrical Engineering, 2024, 119:109501.

[13] Pang J, Liu W, Zhang B, et al. MCNet: Magnitude consistency network for domain adaptive object detection under inclement environments[J]. Pattern Recognition, 2024, 145: 109947.

[14] Wang J, Wan M, Xu Y, et al. Underwater Image Restoration via Constrained Color Compensation and Background Light Color Space-based Haze-Line Model[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62:1-15.

[15] Li C, Guo C, Ren W, et al. An underwater image enhancement benchmark dataset and beyond[J]. IEEE transactions on image processing, 2019, 29: 4376-4389.

[16] Cong R, Yang W, Zhang W, et al. Pugan: Physical model-guided underwater image enhancement using GAN with dual-discriminators[J]. IEEE Transactions on Image Processing, 2023, 32: 4472-4485.

[17] Yeh C H, Lin C H, Kang L W, et al. Lightweight deep neural network for joint learning of underwater object detection and color conversion[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(11): 6129-6143.

[18] Liu R, Jiang Z, Yang S, et al. Twin adversarial contrastive learning for underwater image enhancement and beyond[J]. IEEE Transactions on Image Processing, 2022, 31: 4922-4936.

[19] Islam M J, Xia Y, Sattar J. Fast underwater image enhancement for improved visual perception[J]. IEEE Robotics and Automation Letters, 2020, 5(2): 3227-3234.

[20] W. Lin, J. Zhong, S. Liu, T. Li, G. Li, RoIMix: Proposal-fusion among multiple images for underwater object detection, in: ICASSP, 2020, 2588–2592.

[21] L. Chen, F. Zhou, S. Wang, J. Dong, N. Li, H. Ma, X. Wang, H. Zhou, Swipe-net: Object detection in noisy underwater scenes, Pattern Recognition, 2022, 132:108926.

[22] Jia J, Fu M, Liu X, et al. Underwater object detection based on improved Efficient-Det[J]. Remote Sensing, 2022, 14(18): 4487.

[23] P. Song, P. Li, L. Dai, T. Wang, Z. Chen, Boosting R-CNN: Reweighting R-CNN samples by RPN's error for underwater object detection, Neurocomputing, 2023, 530: 150–164.

[24] Liu H, Song P, Ding R. Towards domain generalization in underwater object detection[C]//2020 IEEE international conference on image processing (ICIP). IEEE, 2020: 1971-1975.

[25] Chen Y, Song P, Liu H, et al. Achieving domain generalization for underwater object detection by domain mix-up and contrastive learning[J]. Neurocomputing, 2023, 528: 20-34.

[26] Liu K, Peng L, Tang S. Underwater object detection using TC-YOLO with attention mechanisms[J]. Sensors, 2023, 23(5): 2567.

[27] Fu C, Fan X, Xiao J, et al. Learning heavily-degraded prior for underwater object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(11): 6887-6896.

[28] Dai L, Liu H, Song P, et al. A gated cross-domain collaborative network for underwater object detection[J]. Pattern Recognition, 2024, 149: 110222.

[29] Yuan J, Cai Z, Cao W. A Novel Underwater Detection Method for Ambiguous Object Finding via Distraction Mining[J]. IEEE Transactions on Industrial Informatics, 2024, 20(1): 123-134.

[30] H. Liu, P. Song, et al., Towards domain generalization in underwater object detection, ICIP, 2020, 1971–1975.

[31] Y. Chen, P. Song, et al., Achieving domain generalization for underwater object detection by domain mixup and contrastive learning, Neurocomputing, 2023, 528(1):20-34.

[32] C. Liu, Z. Wang, et al., A new dataset, Poisson GAN and Aqua-net for underwater object grabbing, IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(5): 2831–2844.

[33] H. Zhang, Y. Wang, F. Dayoub and N. Sünderhauf. VarifocalNet: An IoU-aware Dense Object Detector. CVPR 2021, Nashville, TN, USA, 2021, 8510-8519.

[34] Haiyang Y, Ruige G, Zhongda Z, et al. U-TransCNN: A U-shape transformer-CNN fusion model for underwater image enhancement[J]. Displays, 2025, 88: 103047.

[35] K. He, R. Girshick, and P. Dollár, “Rethinking ImageNet pre-training,” in Proc. IEEE Int. Conf. Comput. Vis., 2019, pp. 4917–4926.

[36] Q. Chen, Y. Wang, T. Yang, X. Zhang, J. Cheng, and J. Sun, “You only look one-level feature,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit, 2021, pp. 13039–13048.

[37] K. Chen, J. Wang, et al., MMDetection: Open MMLab detection toolbox and benchmark, arXiv preprint arXiv:1906.07155 (2019).

[38] T.-Y. Lin, M. Maire, et al., Microsoft COCO: Common objects in context, in: ECCV, 2014, pp. 740–755.

[39] Chen X, Ge X, Yang Q, et al. A Novel Underwater Blurred Target Detection Algorithm Based on RT‐DETR[J]. Concurrency and Computation: Practice and Experience, 2025, 37 (23-24): e70267.

[40] Tian Z, Shen C, Chen H, et al. FCOS: A simple and strong anchor-free object detector[J]. IEEE transactions on pattern analysis and machine intelligence, 2020, 44(4): 1922-1933.

[41] Wen J, Cui J, Zhao B, et al. EnYOLO: a real-time framework for domain-adaptive underwater object detection with image enhancement[C]//2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024: 12613-12619.

[42] Xu W, Wang C, Liang D, et al. NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding[J]. arXiv preprint arXiv:2510.27481, 2025.

[43] Zong Z, Song G, Liu Y. DETRs with collaborative hybrid assignments training[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 6748-6758.

[44] An S, Xu L, et al. HFM: A hybrid fusion method for underwater image enhancement[J]. Engineering Applications of Artificial Intelligence, 2024, 127: 107219.

[45] Peng Y, Li H, Wu P, et al. D-FINE: Redefine regression task in DETRs as fine-grained distribution refinement[J]. arXiv preprint arXiv:2410.13842, 2024.

Published

29-01-2026

Section

Articles

How to Cite

Yao, H., Shi, J., Zang, Y., Zhao, X., Chen, X., Lei, T., & Wang, H. (2026). A Dual-Task Cascade Network for Underwater Object Detection with Scene Adaptability. Academic Journal of Science and Technology, 19(1), 95-104. https://doi.org/10.54097/2amams14