TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection

Cracks are widespread in infrastructure that is closely related to human activity. Using artificial intelligence to detect cracks automatically, known as crack detection, is very popular. The noise in the background of crack images, discontinuity of cracks and other problems make the cr...

Bibliographic Details
Main Authors: Xiaochen Ju, Xinxin Zhao, Shengsheng Qian
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Mathematics
Subjects: crack detection, convolutional neural network, transformer, multi-scale fusion
Online Access: https://www.mdpi.com/2227-7390/10/13/2354
_version_ 1797434105895321600
author Xiaochen Ju
Xinxin Zhao
Shengsheng Qian
collection DOAJ
description Cracks are widespread in infrastructure that is closely related to human activity. Using artificial intelligence to detect cracks automatically, known as crack detection, is very popular. The noise in the background of crack images, discontinuity of cracks and other problems make the crack detection task a huge challenge. Although many approaches have been proposed, two challenges remain: (1) cracks are long and complex in shape, making it difficult to capture long-range continuity; (2) most of the images in crack datasets contain noise, and it is difficult to detect only the cracks while ignoring the noise. In this paper, we propose a novel method called Transformer-based Multi-scale Fusion Model (TransMF) for crack detection, comprising an Encoder Module (EM), a Decoder Module (DM) and a Fusion Module (FM). The Encoder Module uses a hybrid of convolution blocks and Swin Transformer blocks to model the long-range dependencies between different parts of a crack image from both a local and a global perspective. The Decoder Module is designed with a structure symmetrical to that of the Encoder Module. In the Fusion Module, the outputs of each layer of the Encoder Module and Decoder Module, each at its own scale, are fused by convolution, which reduces the effect of background noise and strengthens the correlations between relevant context in order to improve crack detection. Finally, the outputs of all layers of the Fusion Module are concatenated to produce the crack detection result. Extensive experiments on three benchmark datasets (CrackLS315, CRKWH100 and DeepCrack) demonstrate that the proposed TransMF outperforms the best existing baselines.
first_indexed 2024-03-09T10:27:30Z
format Article
id doaj.art-ba57899425034165923b4ffb3889d5d3
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-09T10:27:30Z
publishDate 2022-07-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-ba57899425034165923b4ffb3889d5d3
2023-12-01T21:35:44Z
eng
MDPI AG
Mathematics
2227-7390
2022-07-01
Volume 10, Issue 13, Article 2354
10.3390/math10132354
TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection
Xiaochen Ju (Railway Engineering Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China)
Xinxin Zhao (Railway Engineering Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China)
Shengsheng Qian (Institute of Automation, Chinese Academy of Sciences, Beijing 100090, China)
https://www.mdpi.com/2227-7390/10/13/2354
crack detection; convolutional neural network; transformer; multi-scale fusion
title TransMF: Transformer-Based Multi-Scale Fusion Model for Crack Detection
topic crack detection
convolutional neural network
transformer
multi-scale fusion
url https://www.mdpi.com/2227-7390/10/13/2354