Background: Early childhood dental care (ECDC) is a significant public health opportunity since dental caries is largely preventable and a prime target for reducing healthcare expenditures. This study aims to discover underlying patterns in ECDC utilization among Ohio Medicaid-insured children, which have significant implications for public health prevention, innovative service delivery models, and targeted cost-saving interventions.
Methods: Using 9 years of longitudinal Medicaid data of 24,223 publicly insured child members of an accountable care organization (ACO), Partners for Kids in Ohio, we applied unsupervised machine learning to cluster patients based on their cumulative dental cost curves in early childhood (24–60 months). Clinical validity, analytical validity, and reproducibility were assessed.
Results: The clustering revealed five novel subpopulations: (1) early-onset of decay by age (0.5% of the population, as early as 28 months), (2) middle-onset of decay (3.0%, as early as 35 months), (3) late-onset of decay (5.8%, as early as 44 months), (4) regular preventive care (67.7%), and (5) zero utilization (23.0%). Patients with early-onset of decay incurred the highest dental cost [median annual cost (MAC) = $9,499, interquartile range (IQR): $7,052–$11,216], while patients with regular preventive care incurred the lowest dental cost (MAC = $191, IQR: $99–$336). We also found a plausible correlation of early-onset of decay with complex medical conditions diagnosed at 0–24 months. Almost one-third of patients with early-onset of decay had complex medical conditions diagnosed at 0–24 months. Patients with early-onset of decay also incurred the highest medical cost (MAC = $7,513, IQR: $4,527–$12,546) at 0–24 months.
Conclusion: Among Ohio Medicaid-insured children, five subpopulations with distinctive clinical, cost, and utilization patterns were discovered and validated through a data-driven approach. This novel discovery promotes innovative prevention strategies that differentiate Medicaid subpopulations, and allows for the development of cost-effective interventions that target high-risk patients. Furthermore, an integrated medical-dental care delivery model promises to reduce costs further while improving patient outcomes.
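The abstract above says only that unsupervised machine learning was applied to cumulative dental cost curves; it does not name the algorithm. The following is a minimal sketch of that general idea, assuming plain k-means as a stand-in and using made-up toy data. The function names, the 37-month window (ages 24 to 60 months inclusive), and the cost values are all illustrative assumptions, not the authors' method or data.

```python
import numpy as np


def cumulative_cost_curves(monthly_costs):
    """Turn per-month costs (patients x months) into cumulative cost curves."""
    return np.cumsum(np.asarray(monthly_costs, dtype=float), axis=1)


def assign(curves, centroids):
    """Label each curve with the index of its nearest centroid (squared Euclidean)."""
    dists = ((curves[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(curves, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = curves[rng.choice(len(curves), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(curves, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = curves[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(curves, centroids), centroids


def median_iqr(values):
    """Return (median, 25th percentile, 75th percentile) of a cost vector."""
    q25, q50, q75 = np.percentile(values, [25, 50, 75])
    return q50, q25, q75


def toy_monthly_costs():
    """Hypothetical data: 4 zero-utilization, 4 preventive, 4 decay patients."""
    months = 37  # ages 24..60 months inclusive (assumed monthly resolution)
    zero = np.zeros((4, months))
    preventive = np.zeros((4, months))
    preventive[:, ::6] = 50.0  # a small visit cost every six months
    decay = np.zeros((4, months))
    decay[:, 10] = 8000.0  # one large treatment cost
    return np.vstack([zero, preventive, decay])


costs = toy_monthly_costs()
curves = cumulative_cost_curves(costs)
labels, centroids = kmeans(curves, k=2)
for j in range(2):
    med, q25, q75 = median_iqr(costs[labels == j].sum(axis=1))
    print(f"cluster {j}: n={np.sum(labels == j)}, median total={med:.0f}, IQR={q25:.0f}-{q75:.0f}")
```

Clustering on the cumulative curve rather than on raw monthly costs means that both the timing and the size of a spending jump shape the distance between patients, which is one plausible reason to use it for separating onset-age groups. Choosing the number of clusters and validating them would need separate steps not shown here.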
Accurately predicting patient expenditure in healthcare is an important task with many applications such as provider profiling, accountable care management, and capitated medical payment adjustment. Existing approaches mainly rely on manually designed features and linear regression-based models, which require extensive medical domain knowledge and show limited predictive performance. This paper proposes a multi-view deep learning framework to predict future healthcare expenditure at the individual level based on historical claims data. Our multi-view approach can effectively model heterogeneous information, including patient demographic features, medical codes, drug usage, and facility utilization. We conducted expenditure forecasting tasks on a real-world pediatric dataset that contains more than 450,000 patients. The empirical results show that our proposed method outperforms all baselines for predicting medical expenditure. These findings help toward better preventive care and accountable care in the healthcare domain.
Index Terms: Administrative claims data, deep learning, electronic health record, expenditure prediction, machine learning.
The adoption of electronic health records (EHR) has become universal during the past decade, which has enabled in-depth data-based research. By learning from large amounts of healthcare data, various data-driven models have been built to predict future events for different medical tasks, such as automated diagnosis and heart-attack prediction. Although EHR data are abundant, the population that satisfies the specific criteria for a population-specific task is scarce, making it challenging to train data-hungry deep learning models. This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset, followed by discriminative fine-tuning on each population-specific task. The semantic meaning of medical events is captured in the pre-training stage, and knowledge transfer is completed through the task-aware fine-tuning stage. The fine-tuning process requires minimal parameter modification without changing the model architecture, which mitigates the data scarcity issue and helps train the deep learning model adequately on small patient cohorts. We conducted experiments on a real-world pediatric dataset with more than one million patient records. Experimental results on two downstream tasks demonstrated the effectiveness of our method: our general task-agnostic pre-training framework outperformed tailored task-specific models, achieving more than 10% higher model performance than the baselines. In addition, our framework showed the potential to transfer learned knowledge from one institution to another, which may pave the way for future healthcare model pre-training across institutions.