Measuring Exposure to Incarceration Using the Electronic Health Record

Wang, Emily A.; Long, Jessica B.; McGinnis, Kathleen A.; Wang, Karen; Wildeman, Christopher; Kim, Clara; Bucklen, Kristofer Bret; Fiellin, David A.; Bates, Jonathan; Brandt, Cynthia; Justice, Amy C.

doi:10.1097/mlr.0000000000001049

Cited by 17 publications

(14 citation statements)

References 24 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The Clinical-Longformer model developed in this study utilizing deep learning elements offers improvement over previous methods of identification such as the rule-based YTEX model (F1: 0.75), specifically in sensitivity and overall F1 score. 18 Additionally, the utilization of a larger training set of 800 unique clinician notes compared to the 228 used in Wang et al, as well as the use of the Clinical-Longformer to improve the attention and analysis over longer notes, likely contributed to the improvement in this NLP model.…”

Section: Discussionmentioning

confidence: 99%

“…In addition, the YTEX NLP tool is an example of a rule-based NLP in comparison to deep learning techniques for NLP that are able to handle the variability and diversity of human language better in settings utilizing unstructured data, such as clinician notes from the ED. 18 Boch et al proposed a BERT-based model that examined overall parental justice involvement among the pediatric population, demonstrating the utility of NLP in the identification and exploration of justice involvement. 20 However, there currently is no tool which identifies an individual’s own history of incarceration and timing of the event based on unstructured clinical encounter notes.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Identifying Incarceration Status in the Electronic Health Record Using Natural Language Processing in Emergency Department Settings

Huang,

Socrates,

Gilson

et al. 2023

Preprint

Self Cite

View full text Add to dashboard Cite

BackgroundIncarceration is a highly prevalent social determinant of health associated with high rates of morbidity and mortality and racialized health inequities. Despite this, incarceration status is largely invisible to health services research due to poor electronic health record capture within clinical settings. Our primary objective is to develop and assess natural language processing (NLP) techniques for identifying incarceration status from clinical notes to improve clinical sciences and delivery of care for millions of individuals impacted by incarceration.MethodsWe annotated 1,000 unstructured clinical notes randomly selected from the emergency department for incarceration history. Of these annotated notes, 80% were used to train the Longformer-based and RoBERTa NLP models. The remaining 20% served as the test set. Model performance was evaluated using accuracy, sensitivity, specificity, precision, F1 score and Shapley values.ResultsOf annotated notes, 55.9% contained evidence for incarceration history by manual annotation. ICD-10 code identification demonstrated accuracy of 46.1%, sensitivity of 4.8%, specificity of 99.1%, precision of 87.1%, and F1 score of 0.09. RoBERTa NLP demonstrated an accuracy of 77.0%, sensitivity of 78.6%, specificity of 73.3%, precision of 80.0%, and F1 score of 0.79. Longformer NLP demonstrated an accuracy of 91.5%, sensitivity of 94.6%, specificity of 87.5%, precision of 90.6%, and F1 score of 0.93.ConclusionThe Longformer-based NLP model was effective in identifying patients’ exposure to incarceration and has potential to help address health disparities by enabling use of electronic health records to study quality of care for this patient population and identify potential areas for improvement.

show abstract

Section: Discussionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Identifying Incarceration Status in the Electronic Health Record Using Natural Language Processing in Emergency Department Settings

Huang,

Socrates,

Gilson

et al. 2023

Preprint

Self Cite

View full text Add to dashboard Cite

show abstract

“…To our knowledge, only one study has validated the use of NLP to locate adults with a history of personal incarceration using the Veterans Administration health record. [ 28 ] In their study, the NLP keyword search resulted in an F1-score (a balanced measure of recall and precision) of 0.58; and after integrating NLP and a simplistic machine learning approach, the F-1 score improved to 0.75 [ 28 ]. Our study achieved a similar increase, but our keyword search resulted in an F-1 score of 0.76, and after integrating BERT, the F-1 score improved to 0.93.…”

Section: Discussionmentioning

confidence: 99%

“…To our knowledge, only one study explored the use of natural language processing to locate adults with a history of personal incarceration using the Veterans Administration health record [ 28 ]. No study, to date, has examined the use of natural language processing to locate children of justice-involved parents, absent self-report screening tools, nor has research leveraged advanced machine learning models to enhance model accuracy.…”

Section: Introductionmentioning

confidence: 99%

Locating Youth Exposed to Parental Justice Involvement in the Electronic Health Record: Development of a Natural Language Processing Model

Boch¹,

Hussain²,

Bambach³

et al. 2022

JMIR Pediatr Parent

View full text Add to dashboard Cite

Background Parental justice involvement (eg, prison, jail, parole, or probation) is an unfortunately common and disruptive household adversity for many US youths, disproportionately affecting families of color and rural families. Data on this adversity has not been captured routinely in pediatric health care settings, and if it is, it is not discrete nor able to be readily analyzed for purposes of research. Objective In this study, we outline our process training a state-of-the-art natural language processing model using unstructured clinician notes of one large pediatric health system to identify patients who have experienced a justice-involved parent. Methods Using the electronic health record database of a large Midwestern pediatric hospital-based institution from 2011-2019, we located clinician notes (of any type and written by any type of provider) that were likely to contain such evidence of family justice involvement via a justice-keyword search (eg, prison and jail). To train and validate the model, we used a labeled data set of 7500 clinician notes identifying whether the patient was ever exposed to parental justice involvement. We calculated the precision and recall of the model and compared those rates to the keyword search. Results The development of the machine learning model increased the precision (positive predictive value) of locating children affected by parental justice involvement in the electronic health record from 61% (a simple keyword search) to 92%. Conclusions The use of machine learning may be a feasible approach to addressing the gaps in our understanding of the health and health services of underrepresented youth who encounter childhood adversities not routinely captured—particularly for children of justice-involved parents.

show abstract

“…EHR has been used in hospitals to standardize and integrate medical records written by doctors. Therefore, EHR is an abundant source of health information [14] . Medical texts include various data types, such as electronic medical records and medical and radiological reports.…”

mentioning

confidence: 99%

KTI-RNN: Recognition of Heart Failure from Clinical Notes

et al. 2023

Tsinghua Sci. Technol.

View full text Add to dashboard Cite

Although deep learning methods have recently attracted considerable attention in the medical field, analyzing large-scale electronic health record data is still a difficult task. In particular, the accurate recognition of heart failure is a key technology for doctors to make reasonable treatment decisions. This study uses data from the Medical Information Mart for Intensive Care database. Compared with structured data, unstructured data contain abundant patient information. However, this type of data has unsatisfactory characteristics, e.g., many colloquial vocabularies and sparse content. To solve these problems, we propose the KTI-RNN model for unstructured data recognition. The proposed model overcomes sparse content and obtains good classification results. The term frequency-inverse word frequency (TF-IWF) model is used to extract the keyword set. The latent dirichlet allocation (LDA) model is adopted to extract the topic word set. These models enable the expansion of the medical record text content. Finally, we embed the global attention mechanism and gating mechanism between the bidirectional recurrent neural network (BiRNN) model and the output layer. We call it gated-attention-BiRNN (GA-BiRNN) and use it to identify heart failure from extensive medical texts. Results show that the F 1 score of the proposed KTI-RNN model is 85.57%, and the accuracy rate of the proposed KTI-RNN model is 85.59%.

show abstract

Measuring Exposure to Incarceration Using the Electronic Health Record

Cited by 17 publications

References 24 publications

Identifying Incarceration Status in the Electronic Health Record Using Natural Language Processing in Emergency Department Settings

Identifying Incarceration Status in the Electronic Health Record Using Natural Language Processing in Emergency Department Settings

Locating Youth Exposed to Parental Justice Involvement in the Electronic Health Record: Development of a Natural Language Processing Model

KTI-RNN: Recognition of Heart Failure from Clinical Notes

Contact Info

Product

Resources

About