DCSS-UNet: UNet based on State Space Model for Polyp Segmentation

Authors

  • Xiuwei Wang
  • Biyuan Li

DOI:

https://doi.org/10.54097/6m4zwb07

Keywords:

CNN, Mamba, Medical Image Segmentation, Vision State Space Models

Abstract

Early and accurate segmentation of medical images can provide valuable information for medical treatment. In recent years, the automatic and accurate segmentation of polyps in colonoscopy images has received extensive attention from the artificial intelligence and computer vision research communities. Many researchers have studied models based on CNNs and Transformers in depth. However, CNNs have limited ability to model long-range dependencies, which makes it difficult to fully exploit the semantic information in images, while Transformers are constrained by their quadratic computational complexity. Recently, state space models (SSMs) such as Mamba have been recognized as a promising alternative: they not only model long-range interactions well but also maintain linear computational complexity. Inspired by Mamba, we propose DCSS-UNet, which uses the visual state space (VSS) blocks from VMamba to capture a wide range of contextual information. In the skip connections, we propose a Skip Connects Feature Attention (SFA) module to better pass information from the encoder. In the decoder, we incorporate the Temporal Fusion Attention Module (TFAM) to fuse feature information effectively. In addition, we introduce a custom loss function, Tversky loss, so that the model converges faster and segments polyp boundaries more accurately. Our model was trained on the Kvasir-SEG and CVC-ClinicDB datasets and validated on the Kvasir-SEG, CVC-ColonDB, CVC-300, and ETIS datasets. The results show that the model achieves good segmentation accuracy and generalization performance with a low number of parameters. Compared with VM-UNet, our model is 6.1% ahead on the Kvasir-SEG dataset and 3.1% ahead on the CVC-ClinicDB dataset.
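
The abstract names Tversky loss but gives neither its form nor its hyperparameters. The following is a minimal sketch, assuming the standard definition: one minus the Tversky index, with separate weights on false positives and false negatives. The function name, the default weights 0.3 and 0.7, and the smoothing term are illustrative assumptions, not settings reported by the authors.

```python
def tversky_loss(pred, target, fp_weight=0.3, fn_weight=0.7, eps=1e-7):
    """Tversky loss for one binary mask (illustrative sketch).

    pred      -- flat sequence of predicted foreground probabilities in [0, 1]
    target    -- flat sequence of ground-truth labels (0 or 1)
    fp_weight -- weight on false positives (assumed value)
    fn_weight -- weight on false negatives (assumed value)
    eps       -- smoothing term that avoids division by zero on empty masks
    """
    pred = list(pred)
    target = list(target)
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Soft counts: probabilities are used directly instead of thresholding.
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))

    tversky_index = (tp + eps) / (tp + fp_weight * fp + fn_weight * fn + eps)
    return 1.0 - tversky_index


# Example: one correct pixel, one missed pixel, two correct background pixels.
loss = tversky_loss([1.0, 0.0, 0.0, 0.0], [1, 1, 0, 0])
```

With both weights set to 0.5 the expression reduces to the Dice loss. Setting the false-negative weight higher than the false-positive weight penalizes missed polyp pixels more heavily, which is one common motivation for using this loss on small or thin structures.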

Published

26-09-2024

Section

Articles

How to Cite

Wang, X., & Li, B. (2024). DCSS-UNet: UNet based on State Space Model for Polyp Segmentation. Frontiers in Computing and Intelligent Systems, 9(3), 32-39. https://doi.org/10.54097/6m4zwb07