Over the past few years, machine learning has been applied to the exploration of materials. As in many fields of research, the vast majority of knowledge is published as text, which makes consolidated or statistical analysis across studies and reports challenging. To address this issue, several studies to date have explored the application of natural language processing (NLP). In the present work, we employed the Word2Vec model, previously explored by others, and the BERT model, applying both to the search for chromate replacements in the field of corrosion protection. From a database of over 80 million records, a down-selection of 5990 papers focused on corrosion protection was examined using NLP. This study demonstrates that it is possible to extract knowledge through the automated interpretation of the scientific literature and to achieve expert human-level insights.