Background:
Healthcare claims databases can provide information on the effects of type 2 diabetes mellitus (T2DM) medications as used in
routine care, but often do not contain data on important clinical characteristics, which may be captured in electronic health
records (EHR).
Objectives:
To evaluate the extent to which balance in unmeasured patient characteristics was achieved in claims data, by comparing
against more detailed information from linked EHR data.
Methods:
Within a large US commercial insurance database and using a cohort design, we identified T2DM patients initiating
linagliptin or a comparator agent within class (i.e., other DPP-4 inhibitors) or outside class (i.e., pioglitazone or
sulfonylureas) between 05/2011 and 12/2012. We focused on comparators used at a similar stage of diabetes as linagliptin. For each
comparison, 1:1 propensity score (PS) matching was used to balance over 100 baseline claims-based characteristics, including
proxies of diabetes severity and duration. Additional clinical data from EHRs were available for a subset of patients. We
assessed representativeness of the claims-EHR linked subset, evaluated the balance of claims- and EHR-based covariates before
and after PS-matching via standardized differences (SD), and quantified the potential bias associated with observed
imbalances.
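
The balance metric used above can be illustrated with a minimal sketch. It assumes the commonly used pooled-standard-deviation definition of the standardized difference; the abstract does not state the exact formula, and the function names and example values below are hypothetical and are not study data.

```python
import math
import statistics


def std_diff_continuous(exposed, comparator):
    """Absolute standardized difference for a continuous covariate.

    Assumed definition: |mean_1 - mean_0| / sqrt((var_1 + var_0) / 2).
    """
    pooled_sd = math.sqrt(
        (statistics.variance(exposed) + statistics.variance(comparator)) / 2
    )
    if pooled_sd == 0:
        return 0.0
    return abs(statistics.mean(exposed) - statistics.mean(comparator)) / pooled_sd


def std_diff_binary(p_exposed, p_comparator):
    """Absolute standardized difference for a binary covariate,
    given the prevalence of the characteristic in each group."""
    pooled_sd = math.sqrt(
        (p_exposed * (1 - p_exposed) + p_comparator * (1 - p_comparator)) / 2
    )
    if pooled_sd == 0:
        return 0.0
    return abs(p_exposed - p_comparator) / pooled_sd


# Hypothetical illustration: a characteristic present in 30% of one group
# and 25% of the other.
sd_example = std_diff_binary(0.30, 0.25)
print(f"standardized difference = {sd_example:.3f}")
```

Under this definition the value does not depend on sample size, which is why it is convenient for comparing balance before and after matching, where the cohort size changes.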
Results:
From a claims-based study population of 166,613 T2DM patients, 7,219 (4.3%) patients were linked to their EHR data.
Claims-based characteristics of the EHR-linked and EHR-unlinked patients were comparable (SD<0.1), supporting the
representativeness of the EHR-linked subset. The balance of claims-based and EHR-based patient characteristics appeared to be
reasonable before PS-matching and generally improved in the PS-matched population, with SD<0.1 for most patient
characteristics and SD<0.2 for select laboratory results and BMI categories; these residual imbalances were not large enough
to cause meaningful confounding.
Conclusion:
In the context of pharmacoepidemiologic research on diabetes therapy, choosing appropriate comparison groups paired
with a new user design and 1:1 PS matching on many proxies of diabetes severity and duration improves balance in covariates
typically unmeasured in administrative claims datasets, to an extent that residual confounding is unlikely.