This study presents EliIE, an OMOP CDM-based information extraction system for automatic structuring and formalization of free-text eligibility criteria (EC). According to our evaluation, the machine learning-based EliIE outperforms existing systems and shows promise for further improvement.
Background
To demonstrate that subject selection based on sufficient laboratory results and medication orders in electronic health records can be biased towards sick patients.
Methods
Using electronic health record data from 10,000 patients who received anesthetic services at a major metropolitan tertiary care academic medical center, an affiliated hospital for women and children, and an affiliated urban primary care hospital, the correlation between patient health status, as indicated by the American Society of Anesthesiologists Physical Status Classification (ASA Class), and counts of days with laboratory results or medication orders was assessed with a negative binomial regression model.
Results
Higher ASA Class was associated with more data points: compared to ASA Class 1 patients, ASA Class 4 patients had 5.05 times the number of days with laboratory results and 6.85 times the number of days with medication orders, controlling for age, sex, emergency status, admission type, primary diagnosis, and procedure.
Conclusions
Imposing data sufficiency requirements for subject selection allows researchers to minimize missing data when reusing electronic health records for research, but introduces a bias towards the selection of sicker patients. We demonstrated the relationship between patient health and quantity of data, which may result in a systematic bias towards the selection of sicker patients for research studies and limit the external validity of research conducted using electronic health record data. Additionally, we discovered other variables (i.e., admission status, age, emergency classification, procedure, and diagnosis) that independently affect data sufficiency.
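For readers unfamiliar with count-data models, the analysis described in the Methods above can be sketched as a negative binomial regression whose exponentiated coefficients are rate ratios (e.g., the reported 5.05× days with laboratory results for ASA Class 4 vs. Class 1). The sketch below is illustrative only, not the authors' code; the file and column names are hypothetical assumptions.

```python
# Minimal sketch of a negative binomial regression for counts of days with
# laboratory results, adjusted for the covariates listed in the abstract.
# File name and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("ehr_cohort.csv")  # hypothetical per-patient extract

nb_model = smf.glm(
    formula=(
        "lab_result_days ~ C(asa_class) + age + C(sex) + C(emergency) "
        "+ C(admission_type) + C(primary_diagnosis) + C(procedure)"
    ),
    data=df,
    family=sm.families.NegativeBinomial(),  # log link by default
).fit()

# With a log link, exp(beta) is the rate ratio relative to the reference level,
# e.g. ASA Class 4 vs. ASA Class 1.
print(np.exp(nb_model.params))
```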
Objective
Permanent biventricular pacing benefits patients with heart failure and interventricular conduction delay, but the importance of pacing with and without optimization in patients at risk of low cardiac output after heart surgery is unknown. We hypothesized that pacing parameters independently affect cardiac output. Accordingly, we analyzed aortic flow measured with an electromagnetic flowmeter in patients at risk of low cardiac output, during an ongoing randomized clinical trial of biventricular pacing (n=11) vs. standard of care (n=9).
Methods
A sub-study was conducted in all 20 patients in both groups with stable pacing after coronary artery bypass grafting and/or valve surgery. Ejection fraction averaged 33±15%, and QRS duration averaged 116±19 msec. Effects were measured within one hour of the conclusion of cardiopulmonary bypass. Atrioventricular delay (7 settings) and interventricular delay (9 settings) were optimized in random sequence.
Results
Optimization of atrioventricular delay (171±8 msec), at an interventricular delay of 0 msec, increased flow 14% vs. the worst setting (111±11 msec, p < 0.001) and 7% vs. nominal atrioventricular delay (120 msec, p < 0.001). Interventricular delay optimization increased flow 10% vs. the worst setting (p < 0.001) and 5% vs. nominal interventricular delay (0 msec, p < 0.001). Optimized pacing increased cardiac output 13% vs. atrial pacing at matched heart rate (5.5±0.5 vs. 4.9±0.6 L/min; p = 0.003) and 10% vs. sinus rhythm (5.0±0.6 L/min; p = 0.019).
Conclusions
Temporary biventricular pacing increases intraoperative cardiac output in patients with left ventricular dysfunction undergoing cardiac surgery. Atrioventricular and interventricular delay optimization maximizes this benefit.
Objectives
To automatically identify and cluster clinical trials with similar eligibility features.
Methods
Using the public repository ClinicalTrials.gov as the data source, we extracted semantic features from the eligibility criteria text of all clinical trials and constructed a trial-feature matrix. We calculated the pairwise similarities for all clinical trials based on their eligibility features. For all trials, by selecting one trial as the center each time, we identified trials whose similarities to the central trial were greater than or equal to a predefined threshold and constructed center-based clusters. Then we identified unique trial sets with distinctive trial membership compositions from center-based clusters by disregarding their structural information.
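To make the center-based clustering step concrete, the following sketch builds clusters around each trial from a binary trial-feature matrix and keeps only unique membership sets. It is an assumption-laden illustration, not the authors' implementation: cosine similarity is used as a stand-in for the unspecified pairwise similarity measure, and the trial IDs and toy matrix are hypothetical.

```python
# Illustrative center-based clustering over a binary trial-feature matrix,
# assuming eligibility features have already been extracted per trial.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def center_based_clusters(feature_matrix, trial_ids, threshold=0.9):
    """Return unique trial sets whose similarity to a central trial >= threshold."""
    sims = cosine_similarity(feature_matrix)          # pairwise trial-trial similarities
    unique_sets = set()
    for center in range(len(trial_ids)):
        members = np.where(sims[center] >= threshold)[0]
        if len(members) > 1:                          # ignore singleton "clusters"
            # frozenset drops the center/structure info, keeping only membership
            unique_sets.add(frozenset(trial_ids[i] for i in members))
    return unique_sets


# Toy example: 4 trials x 5 binary eligibility features (hypothetical IDs)
X = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
])
print(center_based_clusters(X, ["NCT-A", "NCT-B", "NCT-C", "NCT-D"]))
```

In this toy run the first two and last two trials form two distinct membership sets; at scale, deduplicating by membership corresponds to the step of disregarding the clusters' structural information.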
Results
From the 145,745 clinical trials on ClinicalTrials.gov, we extracted 5,508,491 semantic features. Of these, 459,936 were unique and 160,951 were shared by at least one pair of trials. Crowdsourcing the cluster evaluation using Amazon Mechanical Turk (MTurk), we identified the optimal similarity threshold, 0.9. Using this threshold, we generated 8,806 center-based clusters. Evaluation of a sample of the clusters by MTurk resulted in a mean score of 4.331±0.796 on a scale of 1–5 (5 indicating “strongly agree that the trials in the cluster are similar”).
Conclusions
We contribute an automated approach to clustering clinical trials with similar eligibility features. This approach can be potentially useful for investigating knowledge reuse patterns in clinical trial eligibility criteria designs and for improving clinical trial recruitment. We also contribute an effective crowdsourcing method for evaluating informatics interventions.