Background. There is growing evidence that social and behavioral determinants of health (SBDH) have a substantial effect on a wide range of health outcomes. Electronic health records (EHRs) have been widely employed to conduct observational studies in the age of artificial intelligence (AI). However, there has been limited review of how to make the most of SBDH information from EHRs using AI approaches. Methods. A systematic search was conducted in six databases to find relevant, recently published peer-reviewed publications. Relevance was determined by screening and evaluating the articles. Based on the selected studies, a methodological analysis of AI algorithms leveraging SBDH information in EHR data was provided. Results. Our synthesis was driven by an analysis of SBDH categories, the relationship between SBDH and healthcare-related statuses, natural language processing (NLP) approaches for extracting SBDH from clinical notes, and predictive models using SBDH for health outcomes. Discussion. The associations between SBDH and health outcomes are complicated and diverse; several pathways may be involved. Using NLP technology to support the extraction of SBDH and other clinical concepts simplifies the identification and extraction of essential concepts from clinical data, efficiently unlocks unstructured data, and aids in the resolution of unstructured data-related issues. Conclusion. Despite known associations between SBDH and diseases, SBDH factors are rarely investigated as interventions to improve patient outcomes. Gaining knowledge about SBDH and how SBDH data can be collected from EHRs using NLP approaches and predictive models improves the chances of influencing health policy change for patient wellness, ultimately promoting health and health equity.
Background. There is significant variability in guideline-concordant documentation in asthma care. However, assessing clinicians’ documentation is not feasible using structured data alone; it requires labor-intensive chart review of electronic health records (EHRs). Certain guideline elements among asthma control factors, such as reviewing inhaler technique, require contextual understanding to be captured correctly from EHR free text. Methods. The study data consist of two sets: (1) manually chart-reviewed data (1039 clinical notes from 300 patients with an asthma diagnosis) and (2) weakly labeled data from distant supervision (27,363 clinical notes from 800 patients with an asthma diagnosis). A context-aware language model, Bidirectional Encoder Representations from Transformers (BERT), was developed to identify inhaler techniques in EHR free text. Both original BERT and clinical BioBERT (cBERT) were applied with cost sensitivity to deal with imbalanced data. Distant supervision using rule-derived weak labels was also incorporated to augment the training set and alleviate the costly manual labeling process in developing a deep learning algorithm. A hybrid approach using post-hoc rules was also explored to fix BERT model errors. The performance of BERT with and without distant supervision, hybrid, and rule-based models was compared in terms of precision, recall, F-score, and accuracy. Results. The BERT models trained on the original data performed similarly to a rule-based model in F1-score (0.837, 0.845, and 0.838 for rules, BERT, and cBERT, respectively). The BERT models with distant supervision produced higher performance (0.853 and 0.880 for BERT and cBERT, respectively) than both the models without distant supervision and the rule-based model. The hybrid models performed best, with F1-scores of 0.877 and 0.904 when applied on top of the distantly supervised BERT and cBERT, respectively. Conclusions. The proposed BERT models with distant supervision demonstrated their capability to identify inhaler techniques in EHR free text and outperformed both the rule-based model and the BERT models trained on the original data. With a distant supervision approach, we may alleviate the costly manual chart review needed to generate the large training data required by most deep learning-based models. A hybrid model was able to fix BERT model errors and further improve performance.
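As a rough illustration of the two rule-based ingredients this abstract mentions (weak labels for distant supervision and post-hoc rules that correct model output), the sketch below uses a few regular expressions. The patterns, the function names `weak_label` and `hybrid_predict`, and the override logic are all hypothetical; the abstract does not describe the study's actual rules, and the BERT classifier itself is represented only by an integer `model_label`.

```python
import re

# Hypothetical patterns for illustration only; not the study's rules.
MENTION = re.compile(r"\binhaler (technique|use)\b", re.I)
ACTION = re.compile(r"\b(review(ed)?|demonstrat(ed|ion)|taught|observed|checked)\b", re.I)
NEGATION = re.compile(r"\b(not|no|never|unable)\b", re.I)


def split_sentences(note: str) -> list[str]:
    """Naive sentence splitter on terminal punctuation."""
    return re.split(r"(?<=[.!?])\s+", note.strip())


def weak_label(note: str) -> int:
    """Weak label: 1 if some sentence mentions the inhaler technique
    together with a review-type action and no negation cue, else 0."""
    for sentence in split_sentences(note):
        if (MENTION.search(sentence)
                and ACTION.search(sentence)
                and not NEGATION.search(sentence)):
            return 1
    return 0


def hybrid_predict(note: str, model_label: int) -> int:
    """Post-hoc rule (assumed): force 0 when every sentence that mentions
    the inhaler technique is negated; otherwise keep the model's label."""
    mentions = [s for s in split_sentences(note) if MENTION.search(s)]
    if mentions and all(NEGATION.search(s) for s in mentions):
        return 0
    return model_label


if __name__ == "__main__":
    notes = [
        "Reviewed inhaler technique with patient. Follow up in 3 months.",
        "Inhaler technique was not reviewed today.",
        "Patient has asthma.",
    ]
    for note in notes:
        print(weak_label(note), hybrid_predict(note, model_label=1), note)
```

In a distant supervision setup, labels from something like `weak_label` would be attached to the unlabeled notes and added to the manually labeled training set, while a check like `hybrid_predict` would run only at prediction time.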
Mobile infrastructure in low- and middle-income countries (LMICs) has shown immense potential to reach the unreachable. Healthcare providers (HCPs) are one such group, and they are at the frontline of the fight against infant mortality in LMICs. Mortality among newborn infants (birth to 28 days) now accounts for around 45% of all under-5 child mortality. Birth asphyxia is one of the three leading causes of newborn death; neonatal resuscitation training among healthcare providers reduces mortality from birth asphyxia. We have developed a mobile phone-based training app, called mobile Helping Babies Survive (mHBS), to support the training of healthcare providers in neonatal resuscitation. mHBS is integrated with the District Health Information System (DHIS2) platform, which is used in over 60 countries around the world. The mHBS/DHIS2 training app is part of an application suite that includes another DHIS2-linked data collection app, mHBS tracker. The mHBS training application has the potential to scale up integration with other neonatal training apps. Ultimately, the mHBS training suite will provide new insights into healthcare worker education along with the necessary tools for effective care of newborn babies.
Background. Electronic health records (EHRs) are a rich source of longitudinal patient data. However, missing information due to clinical care that predated the implementation of EHR system(s) or care that occurred at different medical institutions impedes complete ascertainment of a patient’s medical history. Objective. This study aimed to investigate information discrepancies and to quantify information gaps by comparing the gynecological surgical history extracted from an EHR of a single institution by using natural language processing (NLP) techniques with the manually curated surgical history information through chart review of records from multiple independent regional health care institutions. Methods. To facilitate high-throughput evaluation, we developed a rule-based NLP algorithm to detect gynecological surgery history from the unstructured narrative of the Mayo Clinic EHR. These results were compared to a gold standard cohort of 3870 women with gynecological surgery status adjudicated using the Rochester Epidemiology Project medical records–linkage system. We quantified and characterized the information gaps observed that led to misclassification of the surgical status. Results. The NLP algorithm achieved precision of 0.85, recall of 0.82, and F1-score of 0.83 in the test set (n=265) relative to outcomes abstracted from the Mayo EHR. This performance attenuated when directly compared to the gold standard (precision 0.79, recall 0.76, and F1-score 0.76), with the majority of misclassifications being false negatives in nature. We then applied the algorithm to the remaining patients (n=3340) and identified 2 types of information gaps through error analysis. First, 6% (199/3340) of women in this study had no recorded surgery information or partial information in the EHR. Second, 4.3% (144/3340) of women had inconsistent or inaccurate information within the clinical narrative owing to misinterpreted information, erroneous “copy and paste,” or incorrect information provided by patients. Additionally, the NLP algorithm misclassified the surgery status of 3.6% (121/3340) of women. Conclusions. Although NLP techniques were able to adequately recreate the gynecologic surgical status from the clinical narrative, missing or inaccurately reported and recorded information resulted in much of the misclassification observed. Therefore, alternative approaches to collect or curate surgical history are needed.
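For readers who want to see how the evaluation figures in this abstract relate to one another, here is a minimal sketch of the standard precision, recall, and F1 definitions. The helper names `f1_score`, `prf_from_counts`, and `percent` are illustrative, and the confusion counts in the usage lines are made up; only the precision and recall of 0.85 and 0.82 and the ratios 199/3340, 144/3340, and 121/3340 come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


def percent(part: int, whole: int) -> float:
    """Share of `whole` as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # Test-set precision and recall as reported in the abstract.
    print(round(f1_score(0.85, 0.82), 2))  # 0.83
    # Made-up counts, only to show the call shape.
    print(prf_from_counts(tp=8, fp=2, fn=2))
    # Information-gap shares reported in the abstract.
    print(percent(199, 3340), percent(144, 3340), percent(121, 3340))
```

Because published precision and recall are themselves rounded, an F1 recomputed from them can differ in the last digit from one computed on the raw counts.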
Introduction. Racially and ethnically diverse minorities often experience the disease burden of sexually transmitted infections or diseases (STD) more often than their White counterparts. Yet little is known about the connection among STD, systematic discrimination, racism, and social and behavioral determinants. In addition, little to no detail exists on how this information is recorded in patients’ electronic health records (EHRs). The objective of this study is to assess the completeness of social and behavioral determinants of health (SDOH) data in the EHRs of a minority cohort with STD. Materials and Methods. A total of 2,993 minority patients diagnosed with an STD at the Mayo Clinic were identified for this study. A natural language processing (NLP) algorithm was applied to the Patient-Provided Information (PPI) in their EHRs to extract SDOH information in six domains that are associated with STD, namely alcohol use, substance use, sexual activity, sexual orientation, housing status, and employment status. The completeness of SDOH was assessed in terms of documentation, breadth, and density. Results. Our study indicates that nearly half of the 2,993 patients did not have SDOH-related PPI records in their EHRs, whereas the patients who had SDOH-related PPI records had well-documented records for five of the six SDOH domains (alcohol use, substance use, sexual activity, housing status, and employment status), the exception being sexual orientation. A total of 1,504 patients had PPI in their EHRs for at least one of the six SDOH domains, which is about 50.3% of the study cohort. Most SDOH domains have a short time span of 1 year, with up to 18 years of record data. Our analysis also indicated that education and age have a significant impact on the recording of SDOH-related PPI records. Patients who are female, older, and more highly educated tend to have more SDOH information available in their records. Discussion and Conclusion. We assessed the completeness of SDOH information recorded in the PPI from patients’ EHRs. Due to large amounts of missing SDOH information in the PPI, future research is needed to integrate accurate and robust SDOH-related data for downstream research and to examine the impact of systematic discrimination on how this information is collected and interpreted in the EHRs.
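The abstract names three completeness measures (documentation, breadth, and density) without defining them. The sketch below shows one plausible way to compute such measures for a single patient. The definitions are assumptions made for illustration, not the study's: documentation is whether any record exists, breadth is the number of the six domains with at least one record, and density is records per year of observed span, with a one-year minimum. The record layout and the function name `completeness` are also hypothetical.

```python
from datetime import date

# The six SDOH domains listed in the abstract.
DOMAINS = (
    "alcohol use",
    "substance use",
    "sexual activity",
    "sexual orientation",
    "housing status",
    "employment status",
)


def completeness(records: list[tuple[str, date]]) -> dict:
    """Summarize one patient's SDOH records, given as (domain, date) pairs.

    Assumed definitions (not taken from the paper):
      documented: at least one record in any of the six domains
      breadth:    number of distinct domains with a record (0 to 6)
      density:    records per year between first and last record,
                  treating spans shorter than one year as one year
    """
    valid = [(domain, when) for domain, when in records if domain in DOMAINS]
    if not valid:
        return {"documented": False, "breadth": 0, "density": 0.0}
    dates = [when for _, when in valid]
    span_years = max((max(dates) - min(dates)).days / 365.25, 1.0)
    return {
        "documented": True,
        "breadth": len({domain for domain, _ in valid}),
        "density": len(valid) / span_years,
    }


if __name__ == "__main__":
    patient = [
        ("alcohol use", date(2020, 1, 1)),
        ("housing status", date(2020, 6, 1)),
        ("alcohol use", date(2022, 1, 1)),
    ]
    print(completeness(patient))
    print(completeness([]))
```

Under these assumed definitions, the cohort-level documentation rate would be the share of patients with `documented` set to true, which corresponds to the 1,504 of 2,993 figure reported above.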