PURPOSE OF REVIEW:
Use of the electronic health record (EHR) for cardiovascular disease (CVD) surveillance is increasingly common. However, these data can introduce systematic error that influences the internal and external validity of study findings. We reviewed recent literature on EHR-based studies of CVD risk to summarize the most common types of bias that arise. We then recommend strategies, informed by our own work and that of others, to reduce the impact of these biases in future research.
RECENT FINDINGS:
Systematic error, or bias, is a concern in all observational research, including EHR-based CVD risk surveillance studies. Patients captured in an EHR system may not be representative of the general population because of issues such as informed presence bias, perceptions about the healthcare system that influence entry, and access to health services. Further, the EHR may contain inaccurate information or be missing key data points of interest due to loss to follow-up or over-diagnosis bias. Several strategies, including implementation of unique patient identifiers, adoption of standardized rules for inclusion/exclusion criteria, statistical procedures for data harmonization and analysis, and incorporation of patient-reported data, have been used to reduce the impact of these biases.
SUMMARY:
EHR data provide an opportunity to monitor and characterize CVD risk in populations. However, understanding the biases that arise from EHR datasets is instrumental in planning epidemiological studies and interpreting their findings. Strategies that reduce the impact of bias in EHR data can increase the quality and utility of these data.