Medical applications challenge today's text categorization techniques by demanding both high accuracy and ease of interpretation. Although deep learning has provided a leap ahead in accuracy, this leap comes at the sacrifice of interpretability. To address this accuracy-interpretability challenge, we here introduce, for the first time, a text categorization approach that leverages the recently introduced Tsetlin Machine. In brief, we represent the terms of a text as propositional variables. From these, we capture categories using simple propositional formulae, such as: if "rash" and "reaction" and "penicillin" then Allergy. The Tsetlin Machine learns these formulae from labelled text, utilizing conjunctive clauses to represent the particular facets of each category. Indeed, even the absence of terms (negated features) can be used for categorization purposes. Our empirical comparison with Naïve Bayes, decision trees, linear support vector machines (SVMs), random forest, long short-term memory (LSTM) neural networks, and other techniques is quite conclusive. The Tsetlin Machine either performs on par with or outperforms all of the evaluated methods on both the 20 Newsgroups and IMDb datasets, as well as on a non-public clinical dataset. On average, the Tsetlin Machine delivers the best recall and precision scores across the datasets. Finally, our GPU implementation of the Tsetlin Machine executes 5 to 15 times faster than the CPU implementation, depending on the dataset. We thus believe that our novel approach can have a significant impact on a wide range of text analysis applications, forming a promising starting point for deeper natural language understanding with the Tsetlin Machine.
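The propositional representation described in the abstract above can be illustrated with a minimal Python sketch. It only evaluates a hand-written conjunctive clause over the terms of a text; it does not implement the Tsetlin Machine's learning procedure, and the names `tokenize`, `clause_fires`, and `is_allergy` are hypothetical helpers introduced here for illustration, not part of any published implementation.

```python
from typing import Iterable, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a text and return its set of terms (a crude bag of words)."""
    tokens = {tok.strip('.,;:!?"()') for tok in text.lower().split()}
    return tokens - {""}


def clause_fires(terms: Set[str],
                 required: Iterable[str],
                 forbidden: Iterable[str] = ()) -> bool:
    """Evaluate a conjunctive clause.

    Each term acts as a propositional variable: True if present in the text.
    The clause is the AND of all required terms and of the negations of all
    forbidden terms (the "negated features" mentioned in the abstract).
    """
    return (all(t in terms for t in required)
            and not any(t in terms for t in forbidden))


def is_allergy(text: str) -> bool:
    """Hand-written stand-in for one learned clause:
    if "rash" and "reaction" and "penicillin" then Allergy."""
    return clause_fires(tokenize(text), ("rash", "reaction", "penicillin"))
```

For instance, `is_allergy("Patient developed a rash as a reaction to penicillin.")` evaluates to `True`, whereas a text missing any of the three terms yields `False`. In the approach the abstract describes, many such clauses per category would be learned from labelled text rather than written by hand.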
Background: Natural language processing (NLP) based clinical decision support systems (CDSSs) have demonstrated the ability to extract vital information from patient electronic health records (EHRs) to facilitate important decision support tasks. While obtaining accurate results that are interpretable in the medical domain is crucial, it is demanding because real-world EHRs contain many inconsistencies and inaccuracies. Further, testing of such machine learning-based systems in clinical practice has received limited attention, and the systems are yet to be accepted by clinicians for regular use.
Methods: We present our results from the evaluation of an NLP-driven CDSS developed and implemented in a Norwegian hospital. The system incorporates unsupervised and supervised machine learning combined with rule-based algorithms for clinical concept-based searching to identify and classify allergies of concern for anesthesia and intensive care. The system also implements a semi-supervised machine learning approach to automatically annotate medical concepts in the narrative.
Results: Evaluation of system adoption was performed by a mixed methods approach applying the Unified Theory of Acceptance and Use of Technology (UTAUT) as a theoretical lens. Most of the respondents demonstrated a high degree of system acceptance and expressed a positive attitude towards the system in general and an intention to use the system in the future. Increased detection of patient allergies, and thus improved quality of practice and patient safety during surgery or ICU stays, was perceived as the most important advantage of the system.
Conclusions: Our combined machine learning and rule-based approach benefits system performance, efficiency, and interpretability. The results demonstrate that the proposed CDSS increases detection of patient allergies, and that the system received high-level acceptance by the clinicians using it. Recommendations for further system improvement and implementation include reducing the number of alarms, expanding the system to cover more clinical concepts, integrating more closely with the EHR system, and making more workstations available at the point of care.
A generic method for performing history matching of reservoir flow models using 4D seismic data is described. Several key technology components are used in the procedure, including seismic attribute analysis and classification, reservoir simulator technologies, domain transformation algorithms, and the optimization algorithms that guide the history-matching process. The novel 4D history-matching procedure was applied to the Tordis Field in the North Sea. As a result, the transmissibilities of the fault network in the reservoir model were modified. In addition, a production well was successfully side-tracked using results of the analysis of the time-lapse data.