Early and timely detection of surface damage is important for maintaining the functionality, reliability, and safety of concrete bridges. Recent advances in convolutional neural networks have enabled the development of deep learning-based visual inspection techniques for detecting multiple types of structural damage. However, most deep learning-based techniques are built on two-stage, proposal-driven detectors trained on less complex image data, which may limit their practicality and their integration within intelligent autonomous inspection systems. In this study, a faster and simpler single-stage detector is proposed for detecting multiple concrete bridge damages, based on the real-time object detection technique You Only Look Once (YOLOv3). A dataset of field inspection images labeled with four types of concrete damage (crack, pop-out, spalling, and exposed rebar) is used for training and testing of YOLOv3. To enhance detection accuracy, the original YOLOv3 is further improved by introducing a novel transfer learning method that uses fully pretrained weights from a geometrically similar dataset. Batch renormalization and focal loss are also incorporated to increase accuracy. Testing results show that the improved YOLOv3 achieves a detection accuracy of up to 80% and 47% at Intersection-over-Union (IoU) thresholds of 0.5 and 0.75, respectively. It outperforms both the original YOLOv3 and the two-stage detector Faster Region-based Convolutional Neural Network (Faster R-CNN) with ResNet-101, especially at the IoU threshold of 0.75.
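For context on two quantities named in this abstract, the sketch below (a minimal Python illustration, not the authors' implementation; the helper names iou and focal_loss are assumptions) computes the IoU of two axis-aligned boxes, which decides whether a detection counts as a true positive at a given threshold such as 0.5 or 0.75, and the binary focal loss FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), which downweights easy examples relative to hard ones.

```python
# Minimal sketch (not the authors' implementation) of two quantities named in the
# abstract: the Intersection-over-Union (IoU) used as the evaluation threshold and
# the focal loss used to reweight easy vs. hard examples.
import math

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))

# A prediction counts as a true positive at a given IoU threshold (e.g. 0.5 or
# 0.75) only if its IoU with a ground-truth box of the same class exceeds it.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))     # ~0.143
print(focal_loss(0.9, 1), focal_loss(0.9, 0))  # easy positive vs. hard negative
```

The second print shows the intended effect of focal loss: a confident, correct prediction contributes a near-zero loss, while the same confidence on a wrong label contributes a much larger one.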
Deep learning techniques have recently attracted significant attention in the field of visual inspection of civil infrastructure systems. Currently, most deep learning-based visual inspection techniques use a convolutional neural network to recognize surface defects, either by detecting a bounding box for each defect or by classifying all pixels in an image without distinguishing between individual defect instances. These outputs cannot be directly used to acquire the geometric properties of each individual defect in an image, which hinders the development of fully automated structural assessment techniques. In this study, a novel fully convolutional model is proposed for simultaneously detecting and grouping the image pixels belonging to each individual defect in an image. The proposed model integrates an optimized mask subnet with a box-level detection network: the former outputs a set of position-sensitive score maps for pixel-level defect detection, and the latter predicts a bounding box for each defect to group the detected pixels. An image dataset containing three common types of concrete defects (crack, spalling, and exposed rebar) is used for training and testing of the model. Results demonstrate that the proposed model is robust to various defect sizes and shapes and achieves a mask-level mean average precision (mAP) of 82.4% and a mean intersection over union (mIoU) of 75.5%, with a processing speed of about 10 FPS at an input image size of 576 × 576 when tested on an NVIDIA GeForce GTX 1060 GPU. Its performance is compared with the state-of-the-art instance segmentation network Mask R-CNN and the semantic segmentation network U-Net. The comparative studies show that the proposed model has a distinct defect boundary delineation capability and outperforms both Mask R-CNN and U-Net in accuracy and speed.
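The grouping step described in this abstract can be illustrated with a minimal sketch, assuming pixel-level defect scores are simply thresholded inside each predicted bounding box to form per-instance masks, whose quality is then scored with a mask-level IoU. The position-sensitive score maps and the detection network itself are omitted, and the helpers group_pixels and mask_iou are hypothetical names, not the paper's code.

```python
# Illustrative sketch (assumed, not the paper's implementation) of grouping
# pixel-level scores into per-instance masks using box-level detections, and of
# the mask-level IoU underlying metrics such as mask mAP and mIoU.
import numpy as np

def group_pixels(score_map, boxes, thresh=0.5):
    """Crop an HxW pixel-score map with each (x1, y1, x2, y2) box to get binary masks."""
    masks = []
    for x1, y1, x2, y2 in boxes:
        mask = np.zeros_like(score_map, dtype=bool)
        mask[y1:y2, x1:x2] = score_map[y1:y2, x1:x2] > thresh
        masks.append(mask)
    return masks

def mask_iou(mask_a, mask_b):
    """Mask-level IoU between a predicted and a ground-truth binary mask."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 0.0

# Toy example: one spalling-like blob covered by a single predicted box.
scores = np.zeros((64, 64))
scores[20:40, 20:40] = 0.9                       # pixels the mask branch fires on
pred = group_pixels(scores, [(18, 18, 42, 42)])[0]
gt = np.zeros((64, 64), dtype=bool)
gt[20:40, 20:40] = True
print(round(mask_iou(pred, gt), 3))              # 1.0 for this toy case
```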