Purpose: Routinely collected health data are useful for epidemiological studies of hemophilia, but few studies have validated the accuracy of the algorithms used to identify cases. We aimed to develop and validate algorithms to identify patients with hemophilia A and hemophilia A-related events.
Patients and Methods: This validation study compared medical chart reviews with a database of routinely collected health data, comprising claims data, discharge abstracts, and, in particular, electronic medical records (EMR), at a single Japanese hospital (Kurashiki Central Hospital), using stratified sampling. Two physicians reviewed the charts of all patients at high risk for hemophilia A and of a random sample of patients at moderate risk. Diagnostic accuracy was assessed using sensitivity, specificity, positive predictive value (PPV), and negative predictive value.
Results: There were 1,033,845 eligible patients, of whom 31 had a diagnosis of hemophilia A. ICD-10 diagnosis code D66 in the EMR identified hemophilia A with a sensitivity of 93.5% (95% confidence interval: 78.6-99) and a PPV of 61.7% (95% confidence interval: 46.4-75.5). The administration of ≥10,000 units/month of factor VIII products, as documented in the EMR, identified 81.3% of patients receiving prophylactic factor replacement therapy. The ICD-10 diagnosis code for intracranial bleeding in the EMR identified 75.0% of patients with intracranial bleeding, whereas the corresponding codes for gastrointestinal bleeding and major joint bleeding identified only 11.1% and 1.7% of patients with those events, respectively.
Conclusion: We developed and validated algorithms to identify congenital hemophilia A and hemophilia A-related events. Hemophilia A could be identified with high sensitivity and PPV, but identifying hemophilia A-related events remained challenging.
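
As an illustration of how the accuracy measures reported above are computed, the following minimal sketch derives sensitivity and PPV with exact binomial (Clopper-Pearson) 95% confidence intervals. The counts are assumptions: 29 true positives among 31 chart-confirmed cases and 29 true positives among 47 code-positive patients are inferred from the reported 93.5% and 61.7%, and are not stated in the abstract. The choice of interval method is likewise an assumption, and the helper names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_p(k, n, target):
    """Bisection for p such that binom_cdf(k, n, p) == target.

    For fixed k < n the CDF is strictly decreasing in p.
    """
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid  # CDF still too large, so p must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a proportion k/n."""
    lower = 0.0 if k == 0 else solve_p(k - 1, n, 1 - alpha / 2)
    upper = 1.0 if k == n else solve_p(k, n, alpha / 2)
    return lower, upper


# Assumed counts, inferred from the reported percentages (not from the source).
true_pos = 29        # code-positive patients confirmed by chart review
confirmed = 31       # all chart-confirmed hemophilia A patients
code_positive = 47   # all patients carrying the D66 code

sens = true_pos / confirmed        # sensitivity = TP / (TP + FN)
ppv = true_pos / code_positive     # PPV = TP / (TP + FP)
sens_ci = clopper_pearson(true_pos, confirmed)
ppv_ci = clopper_pearson(true_pos, code_positive)

print(f"Sensitivity: {sens:.1%} (95% CI {sens_ci[0]:.1%} to {sens_ci[1]:.1%})")
print(f"PPV:         {ppv:.1%} (95% CI {ppv_ci[0]:.1%} to {ppv_ci[1]:.1%})")
```

Specificity and negative predictive value follow the same pattern, using true negatives over all chart-negative patients and over all code-negative patients, respectively. Under stratified sampling those counts would need to be weighted by the sampling fractions, which this sketch does not attempt.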