2020
DOI: 10.1515/spp-2019-0010
Healthcare Expenditure Prediction with Neighbourhood Variables – A Random Forest Model

Abstract: We investigated the additional predictive value of an individual’s neighbourhood (quality and location), and of changes therein, on his/her healthcare costs. To this end, we combined several Dutch nationwide data sources from 2003 to 2014, and selected inhabitants who moved in 2010. We used random forest models to predict the area under the curve of the regular healthcare costs of individuals in the years 2011–2014. In our analyses, the quality of the neighbourhood before the move appeared to be quite important…
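A minimal sketch of the kind of modelling the abstract describes is shown below: a random forest regressor trained on individual- and neighbourhood-level features to predict later healthcare costs. This is not the authors' pipeline; the feature names, synthetic data, and hyperparameters are illustrative assumptions only.

```python
# Sketch only: hypothetical features and synthetic data, not the study's pipeline.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([
    rng.normal(size=n),            # hypothetical: neighbourhood quality before the move
    rng.normal(size=n),            # hypothetical: neighbourhood quality after the move
    rng.integers(18, 90, size=n),  # hypothetical: age of the individual
])
# Synthetic, right-skewed cost outcome loosely tied to the covariates.
y = np.exp(1.0 + 0.3 * X[:, 0] + 0.02 * X[:, 2] + rng.normal(scale=0.5, size=n))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=500, min_samples_leaf=20, random_state=0)
model.fit(X_train, y_train)

print("Held-out R^2:", round(model.score(X_test, y_test), 3))
print("Feature importances:", model.feature_importances_.round(3))
```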

Cited by 12 publications (4 citation statements). References 60 publications.
“…Among the chief advantages of using ML models are their learning capacities, enabling them to capture much more complex patterns, sometimes even ascending into semantic and abstract levels, albeit requiring substantially more data points in exchange. In the particular case of decision trees, random forests and gradient boosting machines, collinearity is not a problem, which means no potentially predictive information has to be discarded, and missing values do not require any form of filling [58, 59]. However, there is also an increased risk of identifying spurious (non-significant) associations, mainly due to issues of overfitting [60].…”
Section: Discussion
confidence: 99%
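The collinearity and missing-value points in this statement can be illustrated with a small sketch. It uses scikit-learn's HistGradientBoostingRegressor (chosen here only because it routes missing values natively); the data are synthetic assumptions, not from the study.

```python
# Sketch: a tree-based gradient boosting model fitted on collinear features
# containing unfilled missing values; no imputation or feature dropping needed.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(1)
n = 2_000
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=n)   # strongly collinear with x1
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)

X = np.column_stack([x1, x2])
X[rng.random(X.shape) < 0.2] = np.nan            # 20% missing values, left unfilled

model = HistGradientBoostingRegressor(random_state=0)
model.fit(X, y)                                   # no imputation step required
print("In-sample R^2 with missing, collinear features:", round(model.score(X, y), 3))
```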
“…In recent work, Iwendi et al. [93] used an RF model with the AdaBoost algorithm to predict the severity of Covid-19 and the death or recovery rate of a patient. Due to their simple and self-explanatory structure, these models have also been applied to predicting depression in Alzheimer's patients [91], healthcare monitoring systems [47], and prediction of medical expenditures [98].…”
Section: Machine Learning
confidence: 99%
“…This approach has been used extensively to construct social exclusion measures within the literature, where for example Dell'Anno and Amendola (2015) used PCA to determine the weights and component loadings in an exclusion index for 28 European countries. However, the derivation of weights through PCA is considered to lack transparency (Decancq and Lugo 2013), as it may assign lower weights to a crucial indicator simply because it is weakly correlated with other indicators (Best et al. 2020; Mohnen et al. 2020). The process is also considered arbitrary, as the decision to retain components is largely discretionary (OECD 2008).…”
Section: Social Exclusion Measures and Weighting and Aggregation
confidence: 99%
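The PCA-weighting critique quoted above can be made concrete with a hypothetical sketch: an indicator that is substantively important but weakly correlated with the others receives a near-zero loading on the first principal component, and hence a near-zero weight in the resulting index. Indicator names and data below are assumptions for illustration only.

```python
# Sketch: PCA-derived index weights on synthetic indicators; the weakly
# correlated indicator gets almost no weight regardless of its importance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
n = 1_000
core = rng.normal(size=n)
income     = core + rng.normal(scale=0.3, size=n)  # correlated block of indicators
employment = core + rng.normal(scale=0.3, size=n)
housing    = core + rng.normal(scale=0.3, size=n)
isolation  = rng.normal(size=n)                    # "crucial" but weakly correlated

X = np.column_stack([income, employment, housing, isolation])
X = (X - X.mean(axis=0)) / X.std(axis=0)           # standardise before PCA

loadings = PCA(n_components=1).fit(X).components_[0]
weights = np.abs(loadings) / np.abs(loadings).sum()
print(dict(zip(["income", "employment", "housing", "isolation"], weights.round(3))))
# 'isolation' ends up with a weight close to zero in the first-component index.
```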