2021
DOI: 10.1111/mice.12793
A deep learning‐based image captioning method to automatically generate comprehensive explanations of bridge damage

Abstract: Photographs of bridges can reveal considerable technical information such as the part of the structure that is damaged and the type of damage. Maintenance and inspection engineers can benefit greatly from a technology that can automatically extract and express such information in readable sentences. This is possibly the first study on developing a deep learning model that can generate sentences describing the damage condition of a bridge from images through an image captioning method. Our study shows that by i…

Cited by 86 publications (42 citation statements)
References 69 publications
“…An example of a study that combines visual and language data is the image captioning model developed for bridge damages by Chun et al. (2022). The present study compared and analyzed the differences between the performances of single‐modal and multimodal prompts.…”
Section: Results and Analysis
confidence: 99%
“…One of the recent solutions to this problem is to deploy multimodal data (both images and texts) or derive one type of data from the other, instead of choosing either one of them. An example of a study that combines visual and language data is the image captioning model developed for bridge damages by Chun et al. (2022). The present study compared and analyzed the differences between the performances of single‐modal and multimodal prompts.…”
Section: Single-modal Prompts Versus Multimodal Prompts
confidence: 99%
“…An approach to obtain output that takes into account the relationship between member and damage names has been studied using an image captioning model [10]. In that study, a model is developed to output sentences regarding the damage and the member in which the damage occurs from bridge images, making it possible to obtain information that includes the relationships between words.…”
Section: Introduction
confidence: 99%
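The statement above describes an encoder-decoder image captioning setup: an image is encoded into a feature vector, and a recurrent decoder emits a sentence token by token, so that member names and damage names appear in grammatical relation to each other. The following is a minimal toy sketch of that general idea (assumed architecture with random, untrained weights and a hypothetical damage vocabulary — not the authors' actual model or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical damage-description vocabulary, for illustration only.
VOCAB = ["<start>", "<end>", "corrosion", "crack", "on", "the", "girder", "deck"]
V, D = len(VOCAB), 8  # vocabulary size, hidden-state size

# Stand-ins for learned weights (random here; a real model would train these).
W_img = rng.normal(size=(D, D))   # projects image features to the initial state
W_h   = rng.normal(size=(D, D))   # recurrent state-to-state weights
W_emb = rng.normal(size=(V, D))   # token embeddings fed back into the decoder
W_out = rng.normal(size=(D, V))   # maps hidden state to vocabulary logits

def caption(image_feat, max_len=10):
    """Greedy decoding: condition on the image once, then emit tokens until <end>."""
    h = np.tanh(W_img @ image_feat)            # encoder output seeds the decoder state
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        h = np.tanh(W_h @ h + W_emb[token])    # recurrent update with the last token
        token = int(np.argmax(h @ W_out))      # greedy pick of the next token
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return " ".join(words)

sentence = caption(rng.normal(size=D))
print(sentence)  # a short token sequence drawn from the toy vocabulary
```

Because the decoder conditions each token on both the image state and the previously emitted word, the output sentence carries the word-to-word relationships (e.g. which damage occurs on which member) that a plain classifier cannot express.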
“…In the structural dynamics field, ANN has been successfully adopted in the response prediction and performance evaluation from the vibration signals (Perez‐Ramirez et al., 2019; Y. Xu et al., 2021; Z. Xu et al., 2022). Although these networks have the inherent ability to simulate complex systems with high fidelity (Chun et al., 2022; Hornik et al., 1989; Kuok & Yuen, 2021), there exist challenges in efficient and accurate training deep networks for long‐time dependent problems (e.g., nonlinear flutter behavior; T. Wu & Kareem, 2014b). To address this issue, the long short‐term memory (LSTM) cell (Hochreiter & Schmidhuber, 1997) has been successfully employed in a recurrent neural network (RNN) architecture to simulate nonlinear unsteady aerodynamics (K. Li et al., 2019; T. Li et al., 2020; W. Li et al., 2020).…”
Section: Introduction
confidence: 99%