2021
DOI: 10.1186/s12874-021-01416-5

A narrative review on the validity of electronic health record-based research in epidemiology

Abstract: Electronic health records (EHRs) are widely used in epidemiological research, but the validity of the results is dependent upon the assumptions made about the healthcare system, the patient, and the provider. In this review, we identify four overarching challenges in using EHR-based data for epidemiological analysis, with a particular emphasis on threats to validity. These challenges include representativeness of the EHR to a target population, the availability and interpretability of clinical and non-clinical…

Cited by 79 publications (54 citation statements)
References 72 publications
“…Despite high specificity for accurate diagnosis of a disease, ICD codes are known to have low sensitivity; in other words the presence of a code is a likely indicator of a disease, however, the absence of a code does not reliably indicate absence of that disease. 43 The ICD-9 codes for diabetes with complications (250.1–250.9), for example, have a sensitivity of 63.6% and specificity of 99%. 44 Therefore, since a diagnosis is not captured with high probability, those with more medical encounters are more likely to have the presence of diabetes with complication detected.…”
Section: Discussion (mentioning)
confidence: 99%
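
The arithmetic behind this statement can be sketched in a few lines of Python. This is only an illustration, not the cited authors' method: it assumes, hypothetically, that the quoted sensitivity of 63.6% applies independently at each encounter, which the quoted passage does not state, and the function names and the 10% prevalence are made up for the example.

```python
def detection_probability(sensitivity: float, encounters: int) -> float:
    """Probability that a true case receives at least one code.

    Assumption (not from the source): each encounter records the code
    independently with probability `sensitivity`.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if encounters < 0:
        raise ValueError("encounters must be non-negative")
    return 1.0 - (1.0 - sensitivity) ** encounters


def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Share of coded patients who truly have the condition (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positives expected; PPV is undefined")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    SENSITIVITY = 0.636   # quoted for ICD-9 250.1-250.9
    SPECIFICITY = 0.99    # quoted for ICD-9 250.1-250.9
    PREVALENCE = 0.10     # hypothetical value, for illustration only

    # Detection rises with the number of encounters under the assumption above.
    for n in (1, 2, 3, 5):
        p = detection_probability(SENSITIVITY, n)
        print(f"{n} encounter(s): P(code recorded) = {p:.3f}")

    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, PREVALENCE)
    print(f"PPV at {PREVALENCE:.0%} prevalence = {ppv:.3f}")
```

Under these assumptions the chance of a true case being coded grows with every additional encounter, which is the mechanism the statement describes: patients seen more often are more likely to have the condition detected.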
“…Controlling for the number of inpatient encounters may have been a potential solution to remove the informed decision bias (ie, ≥ n visits to be eligible into the cohort), however, this may incur a selection bias as individuals with fewer visits would be excluded. 43 , 45 Diagnosis data have acceptable quality due to mandates requiring accurate collection of this data, and certain demographic data (ie, age, gender, and ethnicity/race) are mandated by the Meaningful Use objectives. 64 However, there is no mandated coding system for the remaining nonessential demographic (eg, income, marital status, education), laboratory, vital sign, or social data for EHRs.…”
Section: Discussion (mentioning)
confidence: 99%
“…Epidemiological studies using electronic health records (EHRs) data still generally have several weaknesses and overarching challenges, not only in the NHO database: validity of data, representativeness, data availability and interpretation, and missingness. 31–33 For example, as noted in the previous section, hospital EHRs do not contain information on medical care provided outside the hospital. As a solution to this problem, based on the “Next Generation Medical Infrastructure Law” (official name “Act on Anonymized Medical Data That Are Meant to Contribute to Research and Development in the Medical Field”) which was legislated in 2018, the project has been initiated to expand the capability of the data resources by the linkage of the EHR data from the NHO database and the clinics across the country held by the Japan Medical Association.…”
Section: Strengths and Weaknesses of NHO Database (mentioning)
confidence: 99%
“…Many research protocols for access and ethics committees are not yet specifically addressing this area, despite significant attention in terms of implementation (32-34). Importantly, EHR data itself can be inherently biased, for example from the data collection process mandated by the software or if the primary use of the data is for administrative or billing purposes (6).…”
Section: Access and Ethical Approval (mentioning)
confidence: 99%
“…Researchers have extensive experience of producing high quality research from patient data, and we have worked with approval bodies which have adapted protocol guidelines to support this work. However, EHRs are different to many other sources of patient data; they are neither an opportunistic collection of existing administrative data sources nor a purposefully designed comprehensive single database (registry) (6). Rather, they collate information on the patient's clinical condition, laboratory results, diagnoses and treatments as they are experiencing health care.…”
Section: Introduction (mentioning)
confidence: 99%