Purpose
X-linked hypophosphatemia (XLH) is a rare multisystemic disease characterized by low plasma phosphate levels. The aim of this study was to estimate the annual prevalence of XLH and to internally evaluate the performance of predictive algorithms for the early diagnosis of XLH.
Methods
The PediaNet database, which contains data on more than 400,000 children aged up to 14 years, was used to identify a cohort of XLH patients, who were matched with up to 10 controls by date of birth and gender. The annual prevalence of XLH cases per 100,000 patients registered in the PediaNet database was estimated. To identify possible predictors of XLH diagnosis, a logistic regression model and two machine learning algorithms were applied. Predictive analyses were carried out separately for patients with at least 1 year and for patients with at least 2 years of database history in PediaNet.
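The prevalence estimate described above can be illustrated with a minimal Python sketch. It assumes that the annual prevalence is the number of XLH cases divided by the number of patients registered in that year, scaled to 100,000, and that the mean annual prevalence is the simple average of the yearly values. The function names and the yearly counts are hypothetical and are not taken from the study.

```python
def prevalence_per_100k(cases, registered):
    """Prevalence expressed as cases per 100,000 registered patients."""
    if registered <= 0:
        raise ValueError("registered must be a positive number of patients")
    return cases * 100_000 / registered


def mean_annual_prevalence(yearly_counts):
    """Average of the yearly prevalence values.

    yearly_counts maps a calendar year to a (cases, registered) pair.
    """
    rates = [prevalence_per_100k(c, n) for c, n in yearly_counts.values()]
    return sum(rates) / len(rates)


# Hypothetical counts for illustration only (not study data).
example_counts = {
    2019: (2, 200_000),
    2020: (3, 150_000),
}

example_mean = mean_annual_prevalence(example_counts)
print(f"Mean annual prevalence: {example_mean:.2f} per 100,000")
```

Averaging yearly rates weights each calendar year equally, regardless of how many patients were registered in that year; pooling cases and patients across years would be a different estimator.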
Results
Among 431,021 patients registered in the PediaNet database between 2007 and 2020, a total of 12 cases were identified, with a mean annual prevalence of 1.78 cases per 100,000 patients registered in the PediaNet database. Overall, 8 cases and 60 matched controls were included in the analysis. The random forest algorithm achieved the highest area under the receiver operating characteristic curve (AUC) in both the one-year prior ID (AUC = 0.99, 95% CI = 0.99–1.00) and the two-year prior ID (AUC = 1.00, 95% CI = 1.00–1.00) analyses. The XLH predictors selected by the three predictive methods were the number of vitamin D prescriptions, the number of recorded diagnoses of acute respiratory infections, the number of prescriptions of antihistamines for systemic use, the number of prescriptions of X-rays of the lower limbs and pelvis, and the number of allergology visits.
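For readers less familiar with the AUC, the following minimal Python sketch shows one standard way to compute it: as the proportion of case-control pairs in which the case receives the higher predicted score, with ties counted as one half. This illustrates the metric only and is not the authors' evaluation code; the function name and the scores are hypothetical.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 if the case score is higher, 0.5 if the scores tie,
    and 0 otherwise (the Mann-Whitney formulation of the AUC).
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities for illustration only (not study data).
example_cases = [0.9, 0.8, 0.4]
example_controls = [0.7, 0.3, 0.2, 0.4]

example_auc = auc_from_scores(example_cases, example_controls)
print(f"AUC = {example_auc:.3f}")
```

Under this formulation, an analysis with 8 cases and 60 controls compares 8 × 60 = 480 case-control pairs.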
Conclusion
The findings suggest that data-driven machine learning models may play a prominent role in predicting the diagnosis of rare diseases such as XLH.