Background
Reliably abstracting outcomes from free-text electronic medical records remains a challenge. While automated classification of free text has been a popular medical informatics topic, performance validation using real-world clinical data has been limited. The two main approaches are linguistic (natural language processing [NLP]) and statistical (machine learning). The authors have developed a hybrid system for abstracting computed tomography (CT) reports for specified outcomes.
Objectives
The objective was to measure the performance of a hybrid NLP and machine learning system for automated outcome classification of emergency department (ED) CT imaging reports. The hypothesis was that such a system would perform comparably to medical personnel doing the data abstraction.
Methods
A secondary analysis was performed on a prior diagnostic imaging study of 3,710 blunt facial trauma victims. Staff radiologists dictated CT reports as free text, which were then deidentified. A trained data abstractor manually coded the reference standard outcome of acute orbital fracture, with a random subset double-coded for reliability. The data set was randomly split evenly into training and testing sets. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for certainty and temporal status. Findings with low-certainty or past/future modifiers were filtered out, and the remaining findings were combined with the manual reference standard to generate decision tree classifiers using the data mining tools Waikato Environment for Knowledge Analysis (WEKA) 3.7.5 and Salford Predictive Miner 6.6. Performance of the decision tree classifiers was evaluated on the testing set with and without NLP processing.
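The filtering and splitting steps described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the `reports` dictionary, its field names (`term`, `certainty`, `status`), and the helper functions are hypothetical and do not reproduce the actual MedLEE output format or the WEKA and Salford workflows.

```python
import random

# Hypothetical structured findings, loosely modelled on NLP output:
# each finding has a term plus certainty and temporal-status modifiers.
reports = {
    "r1": [{"term": "orbital fracture", "certainty": "high", "status": "current"}],
    "r2": [{"term": "orbital fracture", "certainty": "low", "status": "current"}],
    "r3": [
        {"term": "orbital fracture", "certainty": "high", "status": "past"},
        {"term": "sinus opacification", "certainty": "high", "status": "current"},
    ],
    "r4": [],
}


def keep(finding):
    """Keep a finding unless it has low certainty or a past/future modifier."""
    return finding["certainty"] != "low" and finding["status"] not in {"past", "future"}


def to_features(findings):
    """Represent a report as the set of terms that survive filtering."""
    return {f["term"] for f in findings if keep(f)}


def split_evenly(ids, seed=0):
    """Randomly split report IDs into equal-sized training and testing halves."""
    shuffled = sorted(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


features = {rid: to_features(fs) for rid, fs in reports.items()}
train_ids, test_ids = split_evenly(reports)
```

In a pipeline of this shape, the term sets for the training half would be paired with the manually coded outcome and passed to a decision tree learner; the testing half is held back for evaluation only.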
Results
The performance of machine learning alone was comparable to that reported in prior NLP studies (sensitivity = 0.92, specificity = 0.93, precision = 0.95, recall = 0.93, F-score = 0.94), and the combined use of NLP and machine learning showed further improvement (sensitivity = 0.93, specificity = 0.97, precision = 0.97, recall = 0.96, F-score = 0.97). This performance is similar to, or better than, that of medical personnel in previous studies.
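For readers less familiar with these measures, the sketch below computes them from a 2x2 confusion matrix. The counts are invented for illustration and are not the study's data. For a single positive class, recall is by definition the same quantity as sensitivity; the source does not say how its precision and recall were averaged, so the class-weighted variant shown here is only an assumption about one common convention under which the two figures can differ.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard measures for the positive class of a binary classifier."""
    sensitivity = tp / (tp + fn)   # also called recall of the positive class
    specificity = tn / (tn + fp)   # recall of the negative class
    precision = tp / (tp + fp)     # positive predictive value
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


def weighted_recall(tp, fp, fn, tn):
    """Recall averaged over both classes, weighted by class size."""
    pos, neg = tp + fn, tn + fp
    recall_pos = tp / pos
    recall_neg = tn / neg
    return (pos * recall_pos + neg * recall_neg) / (pos + neg)


# Invented counts: 100 fracture reports and 100 non-fracture reports.
counts = dict(tp=90, fp=5, fn=10, tn=95)
metrics = binary_metrics(**counts)
avg_recall = weighted_recall(**counts)
```

With these invented counts, the positive-class sensitivity is 0.90 while the class-weighted recall is 0.925, which shows how the two labels can carry different values in one report.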
Conclusions
A hybrid NLP and machine learning automated classification system shows promise in coding free-text electronic clinical data.
Targeted interventions can significantly reduce hypothermia in otherwise healthy LPIs and/or LBW newborns and allow them to safely remain in a mother-infant unit. If applied broadly, such preventive practices could decrease preventable hypothermia in high-risk populations.
Hierarchical logistic regression analyses revealed that those with greater income (OR = 0.79, p = .043) and greater pain interference (OR = 0.79, p = .042) were less likely to have non-treatment of pain. Given that people with dementia often report less pain interference, we examined whether other factors, such as depression, influenced the relationship between pain interference and pain non-treatment. A significant interaction between pain interference and depression predicted the non-treatment of pain, indicating that those with less pain interference were more likely to have non-treatment of pain (OR = 1.04, p = .040), but only if they had lower levels of depression. Pain may get less attention among these individuals because distress is less visible. When older adults with dementia present with lower levels of distress, care partners and medical providers may miss pain-related concerns. To improve pain recognition and treatment, additional training on how to identify pain and distress through behavioral observations is needed. Early identification may reduce the daily burden of pain, decrease the likelihood of more serious healthcare problems, and limit hospital admissions for this population. Support: NIH/NINR Grant R01-NR014657-01A1, MIRECC, VA HSRD IQUEST.