How Clinical Practice Research Datalink data are used to support pharmacovigilance

Ghosh, Rebecca; Crellin, Elizabeth; Beatty, Sue; Donegan, Katherine; Myles, Puja; Williams, Rachael

doi:10.1177/2042098619854010

Cited by 34 publications

(30 citation statements)

References 27 publications

(37 reference statements)


“…Routinely collated primary care data from electronic health records (EHRs) are the most frequently used data source for research into medical prescribing practices and pharmacovigilance and pharmacoepidemiological studies [1]. The use of EHRs in the United Kingdom (UK) was initiated in primary care, with 96% of all general practices using them by 1996 [2].…”

Section: Introduction (mentioning)

confidence: 99%

Use of Primary Care Data in Research and Pharmacovigilance: Eight Scenarios Where Prescription Data are Absent

et al. 2021

Self Cite


The use of primary care databases has been integral in pharmacoepidemiological studies and pharmacovigilance. Primary care databases derive from electronic health records and offer a comprehensive description of aggregate patient data, from demography to medication history, and good sample sizes. Studies using these databases improve our understanding of prescribing characteristics and associated risk factors to facilitate better patient care, but there are limitations. We describe eight key scenarios where study data outcomes can be affected by absent prescriptions in UK primary care databases: (1) out-of-hours, urgent care and acute care prescriptions; (2) specialist-only prescriptions; (3) alternative community prescribing, such as pharmacy, family planning clinic or sexual health clinic medication prescriptions; (4) newly licensed medication prescriptions; (5) medications that do not require prescriptions; (6) hospital inpatient and outpatient prescriptions; (7) handwritten prescriptions; and (8) private pharmacy and private doctor prescriptions. The significance of each scenario is dependent on the type of medication under investigation, nature of the study and expected outcome measures. We recommend that all researchers using primary care databases be aware of the potential for missing prescribing data and be sensitive to how this can vary substantially between items, drug classes, patient groups and over time. Close liaison with practising primary care clinicians in the UK is often essential to ensure awareness of nuances in clinical practice.



“…Unlike traditional health research datasets, these routinely collected clinical data offer the opportunity to augment conventional health variables with multiple administrative and social variables (referrals, social care needs, etc), and with longitudinal patterns, such as changes in a patient’s symptoms or medications over time, with high external validity to the real world. These records are frequently used by researchers for epidemiological studies or for monitoring post-marketing drug safety [11].…”

Section: Introduction (mentioning)

confidence: 99%

Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches

Ford, Rooney, Oliver, et al. 2019

BMC Med Inform Decis Mak


Background: Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP. Methods: We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65y with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination. Results: The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing. Conclusions: Our model could aid GPs or health service planners with the early detection of dementia. Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.
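The matching and feature-construction steps described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the matching was implemented, so exact matching on sex and age is an assumption here, and the function names (`match_one_to_one`, `binary_features`) and all patient records are hypothetical. The entity names are borrowed from the top features listed in the abstract.

```python
from collections import defaultdict


def match_one_to_one(cases, controls):
    """Pair each case with one unused control of the same sex and age.

    Assumption: exact matching on (sex, age); cases with no available
    control are left unmatched.
    """
    pool = defaultdict(list)
    for control in controls:
        pool[(control["sex"], control["age"])].append(control)

    pairs = []
    for case in cases:
        candidates = pool[(case["sex"], case["age"])]
        if candidates:
            # pop() removes the control so it cannot be reused
            pairs.append((case, candidates.pop()))
    return pairs


def binary_features(patient_codes, entity_list):
    """Return 1 if a clinical entity appears in the patient's record, else 0."""
    return [1 if entity in patient_codes else 0 for entity in entity_list]


# Invented example records, for illustration only
cases = [
    {"id": "c1", "sex": "F", "age": 82},
    {"id": "c2", "sex": "M", "age": 70},
]
controls = [
    {"id": "k1", "sex": "M", "age": 70},
    {"id": "k2", "sex": "F", "age": 82},
    {"id": "k3", "sex": "F", "age": 82},
]
pairs = match_one_to_one(cases, controls)

entities = ["disorientation", "self-neglect", "behaviour change"]
features = binary_features({"self-neglect", "falls"}, entities)
```

Each resulting feature vector could then be passed to any of the classifiers the abstract names; the sketch stops before that step.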


“…Many important epidemiological studies have been conducted using these population-based, routinely collected data. For example, the safety of the measles, mumps, and rubella vaccine has been studied (4), and the impact on pregnancy complications of legislative changes to make public spaces smoke free (5), among many studies on the safety of drugs in population usage (6).…”

Section: Uses Of Electronic Health Record Data For Epidemiology (mentioning)

confidence: 99%

Can the Use of Bayesian Analysis Methods Correct for Incompleteness in Electronic Health Records Diagnosis Data? Development of a Novel Method Using Simulated and Real-Life Clinical Data

Ford, Rooney, Hurley, et al. 2020

Front. Public Health


Background: Patient health information is collected routinely in electronic health records (EHRs) and used for research purposes, however, many health conditions are known to be under-diagnosed or under-recorded in EHRs. In research, missing diagnoses result in under-ascertainment of true cases, which attenuates estimated associations between variables and results in a bias toward the null. Bayesian approaches allow the specification of prior information to the model, such as the likely rates of missingness in the data. This paper describes a Bayesian analysis approach which aimed to reduce attenuation of associations in EHR studies focussed on conditions characterized by under-diagnosis. Methods: Study 1: We created synthetic data, produced to mimic structured EHR data where diagnoses were under-recorded. We fitted logistic regression (LR) models with and without Bayesian priors representing rates of misclassification in the data. We examined the LR parameters estimated by models with and without priors. Study 2: We used EHR data from UK primary care in a case-control design with dementia as the outcome. We fitted LR models examining risk factors for dementia, with and without generic prior information on misclassification rates. We examined LR parameters estimated by models with and without the priors, and estimated classification accuracy using Area Under the Receiver Operating Characteristic. Results: Study 1: In synthetic data, estimates of LR parameters were much closer to the true parameter values when Bayesian priors were added to the model; with no priors, parameters were substantially attenuated by under-diagnosis. Study 2: The Bayesian approach ran well on real life clinic data from UK primary care, with the addition of prior information increasing LR parameter values in all cases. In multivariate regression models, Bayesian methods showed no improvement in classification accuracy over traditional LR. 
Conclusions: The Bayesian approach showed promise but had implementation challenges in real clinical data: prior information on rates of misclassification was difficult to find. Our simple model made a number of assumptions, such as diagnoses being missing at random. Further development is needed to integrate the method into studies using real-life EHR data. Our findings nevertheless highlight the importance of developing methods to address missing diagnoses in EHR data.
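The attenuation this abstract describes, where missed diagnoses bias an association toward the null, can be shown with a small worked calculation. This is only a sketch under stated assumptions: the counts are invented, under-recording is taken to be equally likely in exposed and unexposed patients (sensitivity 0.5, no false positives), and the helper names are hypothetical. It does not implement the paper's Bayesian correction.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)


def under_record(cases, noncases, sensitivity):
    """Move the undiagnosed share of true cases into the non-case column."""
    recorded = cases * sensitivity
    return recorded, noncases + (cases - recorded)


# Invented true counts: 200/800 among exposed, 100/900 among unexposed
true_or = odds_ratio(200, 800, 100, 900)

# Assumption: only half of true cases are recorded in either group
exp_cases, exp_noncases = under_record(200, 800, sensitivity=0.5)
unexp_cases, unexp_noncases = under_record(100, 900, sensitivity=0.5)
observed_or = odds_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases)

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these numbers the true odds ratio is 2.25 and the observed one is about 2.11: the association is still present but pulled toward 1.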

