MBA-Net: Masked Background Alignment for Robust RGB-Thermal Invisible Gas Segmentation
DOI: https://doi.org/10.54097/n9b9xq31
Keywords: RGB-Thermal Segmentation, Invisible Gas Detection, Thermal Artifact Suppression
Abstract
Industrial safety monitoring requires reliable detection of fugitive gas emissions, which are typically invisible in RGB images but observable in thermal infrared imagery. RGB–thermal (RGB-T) fusion therefore provides a promising solution for area-based gas leak detection. However, existing CNN-based approaches are limited by local receptive fields, making it difficult to capture the diffuse and irregular structures of gas plumes. Although recent Transformer-based methods improve global context modeling, they often amplify background thermal artifacts, leading to high false-positive rates in complex industrial environments. To address these challenges, we propose MBA-Net, a dual-stream Transformer-based framework for robust RGB-T invisible gas segmentation. MBA-Net first employs a dual-stream backbone to extract multi-scale contextual features from RGB and thermal modalities. A Thermal Artifact Suppression Gate (TASG) is introduced to perform structure-guided suppression of thermally salient but structurally inconsistent background responses. To further reduce residual background bias, we design a Masked Background Alignment (MBA) loss that enforces cross-modal feature consistency in background regions during training, without introducing additional inference cost. Finally, a Confidence-Aware Refinement (CAR) module is proposed to adaptively enhance uncertain regions, improving the representation of diffuse gas boundaries and weak plume responses. Extensive experiments on the public Gas-DB benchmark demonstrate that MBA-Net achieves superior segmentation performance with competitive computational efficiency.
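The abstract describes the MBA loss only in words and gives no formula. As a rough illustration of the general idea (penalizing RGB/thermal feature disagreement in background pixels only, at training time), one possible form is a masked mean squared error. The function name, the squared-error distance, the array shapes, and the use of the ground-truth gas mask to define the background are all assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np


def masked_background_alignment_loss(feat_rgb, feat_thermal, gas_mask, eps=1e-8):
    """Hypothetical sketch of a masked background alignment term.

    feat_rgb, feat_thermal: feature maps of shape (C, H, W), assumed to be
        already projected to the same channel count and resolution.
    gas_mask: array of shape (H, W), 1 for gas pixels and 0 for background.
    Returns the channel-averaged squared difference between the two feature
    maps, averaged over background pixels only.
    """
    feat_rgb = np.asarray(feat_rgb, dtype=np.float64)
    feat_thermal = np.asarray(feat_thermal, dtype=np.float64)
    # Background weight: 1 where there is no gas, 0 inside the plume.
    background = 1.0 - np.asarray(gas_mask, dtype=np.float64)
    # Per-pixel discrepancy, averaged over channels -> shape (H, W).
    sq_diff = ((feat_rgb - feat_thermal) ** 2).mean(axis=0)
    # Average over background pixels; eps guards against an all-gas mask.
    return float((sq_diff * background).sum() / (background.sum() + eps))


if __name__ == "__main__":
    rgb = np.zeros((2, 2, 2))
    thermal = np.zeros((2, 2, 2))
    thermal[:, 0, 0] = 5.0          # the modalities disagree at one pixel
    mask = np.zeros((2, 2))
    mask[0, 0] = 1.0                # that pixel is labeled as gas
    # The disagreement lies inside the gas region, so it is not penalized.
    print(masked_background_alignment_loss(rgb, thermal, mask))
```

Because a term of this kind depends only on intermediate features and the training mask, it would be dropped at test time, which is consistent with the abstract's statement that the loss adds no inference cost.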
Copyright (c) 2026 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

