Inception meets Swin Transformer: A Novel Approach for Metal Defect Recognition


  • Donglin Tang
  • Yunliang Zhao



Keywords: Metal defect, Inception, Swin Transformer, signals in grayscale images.


Detecting metal defects with high precision and efficiency is a significant challenge in modern industry. Existing machine learning methods for recognizing common metal surface defects rely heavily on expert knowledge for manual feature extraction, while conventional deep learning methods struggle to capture global feature information from defect images or defect detection signals. To address these issues, we propose a metal defect recognition method based on an Inception-fused Swin Transformer model. The method combines the adaptive local feature extraction capability of the Inception structure with the Swin Transformer's strength in capturing global feature information from defect signals. It also uses the Channel-Coordinate Attention module (CoordAttention) to highlight important feature channels. Experiments on the Ultrasonic Defect Grayscale Image dataset (ULFSL-DET) and the publicly available image-based metal defect dataset (NEU-CLS) yield recognition accuracies of 98.1% and 99.8%, respectively. These results indicate that the method is effective at recognizing metal defect signals in grayscale images and generalizes well to image-based metal defect recognition.
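As a rough illustration of how a Swin Transformer gathers information across a grayscale defect image, the sketch below shows the two window operations the architecture is known for: splitting the image into non-overlapping windows (attention is computed inside each window) and cyclically shifting the image before the next split so that new windows straddle the old boundaries. This is a minimal pure-Python sketch, not the authors' implementation; the function names `window_partition` and `cyclic_shift`, the 4 x 4 toy image, and the window size of 2 are all assumptions chosen for illustration, and the attention computation, Inception branches, and CoordAttention module are omitted.

```python
from typing import List

Image = List[List[float]]  # a single-channel (grayscale) image as rows of pixels


def window_partition(img: Image, window: int) -> List[Image]:
    """Split an H x W image into non-overlapping window x window tiles.

    Tiles are returned in row-major order (left to right, then top to bottom).
    """
    h, w = len(img), len(img[0])
    if h % window or w % window:
        raise ValueError("image sides must be divisible by the window size")
    tiles = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            tiles.append([row[left:left + window] for row in img[top:top + window]])
    return tiles


def cyclic_shift(img: Image, shift: int) -> Image:
    """Roll the image up and to the left by `shift` pixels, wrapping around."""
    h, w = len(img), len(img[0])
    return [[img[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


# Hypothetical 4 x 4 toy image with pixel values 0..15.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]

regular_windows = window_partition(toy, 2)
shifted_windows = window_partition(cyclic_shift(toy, 1), 2)

print(regular_windows[0])  # [[0, 1], [4, 5]]
print(shifted_windows[0])  # [[5, 6], [9, 10]]
```

In the toy example, the first shifted window contains pixels 5, 6, 9, and 10, which belong to four different regular windows. Alternating regular and shifted partitions in this way lets window-local attention propagate information across the whole image over successive layers.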











How to Cite

Inception meets Swin Transformer: A Novel Approach for Metal Defect Recognition. (2024). Academic Journal of Science and Technology, 9(1), 176-185.
