Milad Botros scite author profile

Milad Botros

2 Publications

2 Citation Statements Received

14 Citation Statements Given

Affiliations

Colorado Education Initiative

Publications

Natural Language Understanding for Safety and Risk Management in Oil and Gas Plants

Milana, Darena, Bettio et al. 2019

In Oil and Gas, technical staff are involved in critical activities every day. Safety is therefore a key priority, even more so now that frontier and continuously updated technologies are a fundamental part of the transformation of traditional industrial processes. While safety reports and investigations have long been adequately stored and continuously monitored by expert professionals, Artificial Intelligence applications to natural language now provide the opportunity to develop a decision support system capable of extracting insights, predicting the risk of future operations, performing scenario analysis and prescribing risk mitigation actions on massive amounts of data. In this work, we used an Open Innovation approach to develop a Safety Pre-Sense system, leveraging Machine Learning and Natural Language Processing techniques and incorporating multiple different (and often unexpected) sources of information. Starting from standard Natural Language Processing tasks, we leverage linguistic patterns to build binary Document-Term Matrices. Operating on these matrices, we implemented a Domain Keyword Extraction algorithm to extract words (or multi-word terms) that have high specificity. Our pipeline also provides a language-agnostic method to detect similarities between documents written in different languages and cluster them accordingly, in order to obtain clear descriptors that can be used to understand their meaning. To do so, we map our text into a high-dimensional vector space, where we apply cluster analysis to group semantically close documents into consistent, multilingual groups. We then extract, for each language, a list of domain keywords that characterize every cluster. Next, we identify similarities in the data in a completely data-driven manner, with the objective of extracting correlations between event features (such as geographical location and cause or type of event). As a result, we extract new aggregations of complex items such as severe Accidents or Work Processes. We also demonstrate how Correspondence Analysis and Pattern Mining algorithms are able to extract and visualize correlations between topics and events, leveraging a dynamic Qlik dashboard. Finally, we point to additional sources of information, both internal and external to our company, that can be used to enhance our analysis.
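
The abstract does not spell out how the binary Document-Term Matrices or the keyword scoring are computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes whitespace tokenization, and it assumes that "specificity" can be approximated as the ratio between a term's document frequency in a domain corpus and its smoothed document frequency in a background corpus. The function names (`binary_dtm`, `domain_keywords`) and the sample sentences are hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation (an assumption)."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def binary_dtm(docs):
    """Return (vocab, matrix) where matrix[i][j] is 1 if term j occurs in doc i, else 0."""
    token_sets = [set(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*token_sets))
    matrix = [[1 if term in toks else 0 for term in vocab] for toks in token_sets]
    return vocab, matrix


def document_frequency(vocab, matrix):
    """Count, for each term, the number of documents that contain it."""
    return {term: sum(row[j] for row in matrix) for j, term in enumerate(vocab)}


def domain_keywords(domain_docs, background_docs, top_n=3):
    """Rank domain terms by an assumed specificity score:
    share of domain documents containing the term, divided by the
    add-one-smoothed share of background documents containing it."""
    df_domain = document_frequency(*binary_dtm(domain_docs))
    df_background = document_frequency(*binary_dtm(background_docs))
    scores = {}
    for term, df in df_domain.items():
        p_domain = df / len(domain_docs)
        p_background = (df_background.get(term, 0) + 1) / (len(background_docs) + 1)
        scores[term] = p_domain / p_background
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    safety_reports = [
        "gas leak near valve",
        "valve failure caused gas leak",
        "worker reported leak",
    ]
    general_text = ["the worker reported a delay", "gas prices rose"]
    print(domain_keywords(safety_reports, general_text, top_n=2))
```

Because the matrix is binary, repeated occurrences of a term inside one document do not inflate its score; only the number of documents containing the term matters. A real pipeline would also need multi-word terms and language-specific preprocessing, which this sketch leaves out.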

Natural Language Processing Applications in Case-Law Text Publishing

Tarasconi, Botros, Caserio et al. 2020

Processing case-law content for electronic publishing is a time-consuming activity that encompasses several sub-tasks and usually involves adding annotations to the original text. Recent trends in Artificial Intelligence and Natural Language Processing, however, enable the automatic and efficient analysis of large volumes of text. In this paper we present our Machine Learning solution to three specific business problems that a real-world Italian publisher regularly meets in its day-to-day work: recognition of legal references in text spans, ranking of new content by relevance, and text classification according to a given tree of topics. We experimented with different approaches based on the BERT language model, together with alternatives typically based on Bag-of-Words. The optimal solution, deployed in a controlled production environment, was based on fine-tuned BERT in two out of three cases (extraction of legal references and text classification), while for relevance ranking a Random Forest model with hand-crafted features was preferred. We conclude by discussing the concrete impact of the developed prototypes, as perceived by the publisher.
