Although natural language processing (NLP) tools have long been available for English, this is not the case for many other languages, particularly for context-specific texts such as clinical notes. This poses a challenge for tasks like text classification in languages other than English. In the absence of basic NLP tools, manually engineering features that capture the semantic information of the documents is a potential solution, but it is very time-consuming. Deep neural networks, particularly deep recurrent neural networks (RNNs), have been proposed as end-to-end models that learn features and parameters jointly, thus avoiding the need to encode features manually. We compared the performance of two classifiers for labeling 14,718 clinical notes in Spanish according to the patients' smoking status: a bag-of-words model involving heavy manual feature engineering and a bidirectional long short-term memory (LSTM) deep RNN with GloVe word embeddings. The RNN slightly outperformed the bag-of-words model while requiring 80% less overall development time. Such algorithms can facilitate the exploitation of clinical notes in languages for which NLP tools are not as developed as they are for English.
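As a rough illustration of what a bag-of-words classifier for smoking status involves, the following minimal sketch counts tokens per class and scores new notes with a multinomial naive Bayes rule. It is not the authors' pipeline: the abstract does not describe their features or classifier, so the naive Bayes classifier, the tokenizer, the toy Spanish notes, the label names, and all function names here are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, including Spanish accented ones."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


def train_naive_bayes(notes, labels):
    """Collect per-class token counts (the bag-of-words) and class frequencies."""
    word_counts = {}              # label -> Counter of token counts
    doc_counts = Counter(labels)  # label -> number of training notes
    vocab = set()
    for text, label in zip(notes, labels):
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability under add-one smoothing."""
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, counts in model["word_counts"].items():
        score = math.log(model["doc_counts"][label] / total_docs)
        denom = sum(counts.values()) + vocab_size
        for token in tokens:
            score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy notes; real clinical notes are longer and far noisier.
notes = [
    "paciente fumador activo de 20 cigarrillos al dia",
    "no fuma ni bebe alcohol",
    "exfumador desde hace diez años",
    "fumadora de un paquete diario",
    "niega habito tabaquico no fuma",
    "exfumador dejo el tabaco en 2010",
]
labels = ["smoker", "non-smoker", "ex-smoker", "smoker", "non-smoker", "ex-smoker"]

model = train_naive_bayes(notes, labels)
print(predict(model, "paciente no fuma"))
print(predict(model, "exfumador desde 2015"))
```

Even in this toy form, the tokenizer and vocabulary are hand-built decisions, which hints at why manual feature engineering can dominate development time compared with a model that learns its features from word embeddings.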