Sample and Structure-Guided Network for Road Crack Detection

Bibliographic Details
Main Authors: Siyuan Wu, Jie Fang, Xiangtao Zheng, Xijie Li
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8835080/
Description
Summary: Road maintenance is an indispensable task for traffic management departments and has attracted much attention over the last decade owing to the rapid development of traffic networks. Cracks are the early form of many kinds of road damage, and repairing them in time can significantly reduce maintenance costs, so detecting crack regions quickly and accurately is in great demand. Many methods based on image processing techniques have been proposed for crack detection, but their performance does not meet expectations. The reason is that most of these methods rely on low-level features such as color and texture, which are easily affected by varying conditions such as lighting and shadow. Inspired by the successes of machine learning and artificial intelligence, this paper presents a sample and structure guided network for detecting road cracks. Specifically, the proposed network is based on the U-Net architecture, which preserves details from input to output through skip connections. Because the number of crack samples is much smaller than that of non-crack ones, directly using the conventional cross-entropy loss cannot optimize the network effectively; the Focal loss is therefore used to address this optimization problem. Additionally, we incorporate a self-attention strategy into the proposed network, which enhances its stability by encoding second-order information among different local regions into the final features. Finally, we test the proposed method on four datasets (three public ones with labels and one photographed dataset without labels) to validate its effectiveness. Notably, for the photographed dataset, we design a series of image processing strategies, such as contrast enhancement, to improve the generalization capability of the proposed method.
ISSN: 2169-3536
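
The Summary says the Focal loss replaces the conventional cross-entropy loss because crack samples are heavily outnumbered by non-crack ones. The following is a minimal sketch of a binary focal loss, not the authors' implementation. The function name `binary_focal_loss`, the flat-list input format, and the default values `gamma=2.0` and `alpha=0.25` are assumptions for illustration (the defaults are those commonly quoted for the Focal loss in general); the record does not state the settings used in the paper.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a flat list of pixels.

    probs  : predicted crack probabilities, each in [0, 1]
    labels : ground-truth labels, 1 = crack, 0 = non-crack
    gamma  : focusing parameter; gamma = 0 removes the focusing term
    alpha  : weight for the crack class (1 - alpha for non-crack)

    gamma and alpha defaults are assumed values, not taken from the paper.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one pixel is required")

    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of well-classified pixels,
        # so the many easy non-crack pixels contribute less to the total.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Hypothetical toy input: nine easy non-crack pixels and one hard crack pixel.
toy_probs = [0.05] * 9 + [0.30]
toy_labels = [0] * 9 + [1]
toy_loss = binary_focal_loss(toy_probs, toy_labels)
```

With `gamma=0.0` and `alpha=0.5` the function reduces to half the ordinary cross-entropy, which makes the effect of the focusing term easy to compare against the baseline the Summary describes.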