Multi-Scale Adaptive Road Extraction Network for High-Resolution Remote Sensing Images
DOI:
https://doi.org/10.54097/cp052n67
Keywords:
Remote Sensing Image, Road Extraction, Semantic Segmentation, Atrous Pyramid Convolution, Attention Mechanisms
Abstract
Road extraction from high-resolution remote sensing images faces two key challenges: discontinuities caused by occlusion from trees and buildings, and computational complexity arising from varying road orientations and widths. To address these problems, this paper proposes a Multi-Scale Adaptive Road Extraction Network (MARENet). MARENet uses ConvNeXt as its backbone and introduces two modules: the Atrous Pyramid Convolution (APC) module, which accurately extracts the roads in the image, and the Frequency Adaptive Transformer (FAT) module, which focuses on the important features in the image and avoids incomplete extraction. MARENet was evaluated on DeepGlobe, a commonly used and challenging dataset. The proposed method outperforms existing methods in accuracy, precision, recall, F1-score, and IoU, and the results show that the network is robust when processing high-resolution road remote sensing images.
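The five evaluation metrics named in the abstract have standard pixel-wise definitions for binary road masks. The following is a minimal sketch of those definitions, not the paper's evaluation code: the function name `road_metrics`, the NumPy implementation, and the toy 3x4 masks are assumptions made for illustration, and the convention of returning 0.0 for an empty denominator is likewise an assumption.

```python
import numpy as np


def road_metrics(pred, target):
    """Pixel-wise metrics for binary road masks (1 = road, 0 = background).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)

    # Confusion-matrix counts over all pixels.
    tp = int(np.sum(pred & target))    # road predicted as road
    fp = int(np.sum(pred & ~target))   # background predicted as road
    fn = int(np.sum(~pred & target))   # road predicted as background
    tn = int(np.sum(~pred & ~target))  # background predicted as background

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }


# Toy masks: the prediction misses one road pixel and adds one false one.
pred = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 0],
                 [0, 1, 1, 0]])
target = np.array([[1, 1, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 1]])

metrics = road_metrics(pred, target)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because road pixels are usually a small fraction of a remote sensing image, accuracy is dominated by the background class, which is one reason IoU and F1-score are reported alongside it.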
License
Copyright (c) 2024 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.