Objective
This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research.
Materials and Methods
We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers.
Results
DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a “long tail” of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific.
Discussion
Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning).
Conclusion
Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.
PLOS ONE | https://doi.org/10.1371/journal.pone.0226718 January 7, 2020 1 / 12 OPEN ACCESS Citation: Wang Q, Reps JM, Kostka KF, Ryan PB, Zou Y, Voss EA, et al. (2020) Development and validation of a prognostic model predicting symptomatic hemorrhagic transformation in acute ischemic stroke at scale in the OHDSI network. PLoS ONE 15(1): e0226718. https://doi.org/ 10.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.