Extracting safety information from multi-lingual accident reports using an ontology-based approach

Safety occurrence reports can contain valuable information on how incidents occur, revealing knowledge that can assist safety practitioners. This paper presents and discusses a literature review exploring how Natural Language Processing (NLP) has been applied to occurrence reports within safety-critical industries, informing further research on the topic and highlighting common challenges. Some of the uses of NLP include the ability for occurrence reports to be automatically classified against categories, and entities such as causes and consequences to be extracted from the text as well as the semantic searching of occurrence databases. The review revealed that machine learning models form the dominant method when applying NLP, although rule-based algorithms still provide a viable option for some entity extraction tasks. Recent advances in deep learning models such as Bidirectional Transformers for Language Understanding are now achieving a high accuracy while eliminating the need to substantially pre-process text. The construction of safety-themed datasets would be of benefit for the application of NLP to occurrence reporting, as this would allow the fine-tuning of current language models to safety tasks. An interesting approach is the use of topic modelling, which represents a shift away from the prescriptive classification taxonomies, splitting data into “topics”. Where many papers focus on the computational accuracy of models, they would also benefit from real-world trials to further inform usefulness. It is anticipated that NLP will soon become a mainstream tool used by safety practitioners to efficiently process and gain knowledge from safety-related text.

“…Methods that explicitly state the development of an ontology. Ontologies generally describe taxonomic relationships [17].…”

Section: Ontology (mentioning)

confidence: 99%

A Scoping Literature Review of Natural Language Processing Application to Safety Occurrence Reports

Ricketts, Barry, Guo, et al. 2023

Safety

“…Zhou and Lei explored the paths between latent and active errors for 407 railway accidents/incidents by using the Human Factors Analysis and Classification System [11]. Hughes et al described a multi-lingual ontology to identify specific classes of railway safety incident based on 5065 safety incident reports [12]. Drawing on existing research findings, this study conducts the automated analysis of bridge operational accidents based on the collected bridge operational accidents.…”

Section: Literature Review (mentioning)

confidence: 99%

Knowledge graph and CBR-based approach for automated analysis of bridge operational accidents: Case representation and retrieval

Xu, Wei, Cai, et al. 2023

PLoS ONE

Bridge operational accident analysis is a critical process in bridge operational risk management. It provides valuable knowledge support for responding to newly occurring accidents. However, there are three issues: (1) research specifically focused on the past bridge operational accidents is relatively scarce; (2) there is a lack of mature research findings regarding the bridge operational accidents knowledge representation; and (3) in similar case retrieval, while case-based reasoning (CBR) is a valuable approach, there are still some challenges and limitations associated with its usage. To tackle these problems, this research proposed an automated analysis approach for bridge operational accidents based on a knowledge graph and CBR. The approach includes case representation and case retrieval, leveraging advancements in computer science and artificial intelligence. In the proposed approach, the case representation involves the adoption of a knowledge graph to construct multi-dimensional networks. The knowledge graph captures the relationships between various factors and entities, allowing for a comprehensive representation of accidents domain knowledge. In the case retrieval, a multi-circle layer retrieval strategy was innovatively proposed to enhance retrieval efficiency. Three target cases were randomly selected to verify the validity of the proposed methodology. The combination of a knowledge graph and CBR can indeed provide useful tools for the automated analysis of bridge operational accidents. Additionally, the proposed methodology can serve as a reference for intelligent risk management in other types of infrastructures.

“…A domain-specific ontology was developed, which employed NLP to extract subject, predicate, and object from unstructured textual data to improve human communication in aviation (Abdullah et al, 2019). Hughes et al (2019) developed an ontology-based approach capable of using multiple languages (German, French, or Italian) to identify safety incidents on railways, such as falling of passengers and being stuck by doors. A framework consisting of ontology and NLP was proposed to automate literature knowledge from abstract instead of bibliometric analysis, which is only limited to critical phrases such as authors, publications, journals, and citations.…”

Section: Introduction (mentioning)

confidence: 99%

Textual data transformations using natural language processing for risk assessment

et al. 2023

Underlying information about failure, including observations made in free text, can be a good source for understanding, analyzing, and extracting meaningful information for determining causation. The unstructured nature of natural language expression demands advanced methodology to identify its underlying features. There is no available solution to utilize unstructured data for risk assessment purposes. Due to the scarcity of relevant data, textual data can be a vital learning source for developing a risk assessment methodology. This work addresses the knowledge gap in extracting relevant features from textual data to develop cause–effect scenarios with minimal manual interpretation. This study applies natural language processing and text‐mining techniques to extract features from past accident reports. The extracted features are transformed into parametric form with the help of fuzzy set theory and utilized in Bayesian networks as prior probabilities for risk assessment. An application of the proposed methodology is shown in microbiologically influenced corrosion‐related incident reports available from the Pipeline and Hazardous Material Safety Administration database. In addition, the trained named entity recognition (NER) model is verified on eight incidents, showing a promising preliminary result for identifying all relevant features from textual data and demonstrating the robustness and applicability of the NER method. The proposed methodology can be used in domain‐specific risk assessment to analyze, predict, and prevent future mishaps, ameliorating overall process safety.

Extracting safety information from multi-lingual accident reports using an ontology-based approach

Cited by 37 publications

References 37 publications
