2015
DOI: 10.1002/bimj.201400143

Variable selection for zero‐inflated and overdispersed data with application to health care demand in Germany

Abstract: In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider maximum likelihood function plus a penalty including the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (S…
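
As a reading aid, the penalized likelihood idea in the abstract can be written out for the LASSO case. This is a sketch under assumptions: the symbols ($\mu_i$, $\theta$, $\pi_i$, $\beta$, $\gamma$, $\lambda_1$, $\lambda_2$), the log and logit links, and the scaling of the penalty by $n$ are a common textbook parametrization and are not taken from the abstract itself.

```latex
% ZINB probability mass function (assumed standard parametrization)
P(Y_i = 0) = \pi_i + (1 - \pi_i)\left(\frac{\theta}{\mu_i + \theta}\right)^{\theta},
\qquad
P(Y_i = y) = (1 - \pi_i)\,
  \frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}
  \left(\frac{\mu_i}{\mu_i + \theta}\right)^{y}
  \left(\frac{\theta}{\mu_i + \theta}\right)^{\theta},
\quad y = 1, 2, \dots

% Assumed links for the count and zero components
\log \mu_i = x_i^{\top}\beta,
\qquad
\operatorname{logit} \pi_i = z_i^{\top}\gamma

% LASSO-penalized log-likelihood (intercepts left unpenalized)
\ell_{\mathrm{pen}}(\beta, \gamma, \theta)
  = \sum_{i=1}^{n} \log P(Y_i = y_i)
  - n\,\lambda_1 \sum_{j \ge 1} |\beta_j|
  - n\,\lambda_2 \sum_{k \ge 1} |\gamma_k|
```

Maximizing $\ell_{\mathrm{pen}}$ shrinks some coefficients exactly to zero, which is what turns estimation into variable selection; other penalties replace the absolute-value terms with a different penalty function.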

Cited by 44 publications (55 citation statements)
References: 23 publications
“…Consequently, we used either a negative binomial regression (Experiment 2) or a zero‐inflated negative binomial generalized linear model (ZINB; Experiment 3), while accounting for zero‐inflated data, to assess the number of trapped D. suzukii and pollinators by collection period, trap height, collection period × trap height, and month of collection. Because both models can account for overdispersion, we initially completed analyses using both models, and subsequently selected the model with the lowest Akaike information criterion (AIC) score. Interval duration (Experiment 2: 2, 3, 8 h; Experiment 3: 2, 3, 7 h) data were used as the offset and the period with the highest trap catches was used as the reference variable.…”
Section: Methods; citation type: supporting
confidence: 65%
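
The AIC-based choice between a negative binomial and a ZINB model described in this statement can be sketched as follows. This is a minimal illustration and not the citing authors' analysis: the toy counts are invented, the models are intercept-only with no offset or covariates, and the coarse grid search (the hypothetical helper `fit_by_grid`) only approximates maximum likelihood.

```python
import math

def nb_logpmf(y, mu, theta):
    """Log-probability of count y under a negative binomial with mean mu, dispersion theta."""
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))

def zinb_logpmf(y, mu, theta, pi):
    """Log-probability under a ZINB: a structural zero with probability pi, otherwise NB."""
    if y == 0:
        return math.log(pi + (1.0 - pi) * math.exp(nb_logpmf(0, mu, theta)))
    return math.log(1.0 - pi) + nb_logpmf(y, mu, theta)

def aic(loglik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * loglik

def fit_by_grid(counts, with_zero_inflation):
    """Coarse grid search for an intercept-only model (approximate MLE only).

    Returns (best log-likelihood found, number of free parameters).
    """
    mus = [0.5 * k for k in range(1, 21)]      # 0.5 .. 10.0
    thetas = [0.5 * k for k in range(1, 21)]   # 0.5 .. 10.0
    # pi = 0 reduces the ZINB to a plain NB, so the NB fit fixes pi at 0.
    pis = [0.1 * k for k in range(0, 10)] if with_zero_inflation else [0.0]
    best = -math.inf
    for mu in mus:
        for theta in thetas:
            for pi in pis:
                ll = sum(zinb_logpmf(y, mu, theta, pi) for y in counts)
                best = max(best, ll)
    return best, (3 if with_zero_inflation else 2)

# Invented toy counts with many zeros (assumption, for illustration only).
toy_counts = [0, 0, 0, 0, 0, 0, 0, 0, 3, 5, 4, 6, 2, 7, 5, 4, 0, 0, 6, 3]

ll_nb, k_nb = fit_by_grid(toy_counts, with_zero_inflation=False)
ll_zinb, k_zinb = fit_by_grid(toy_counts, with_zero_inflation=True)
aic_nb, aic_zinb = aic(ll_nb, k_nb), aic(ll_zinb, k_zinb)

chosen = "ZINB" if aic_zinb < aic_nb else "NB"
print(f"AIC NB = {aic_nb:.2f}, AIC ZINB = {aic_zinb:.2f}, chosen: {chosen}")
```

Because the ZINB grid includes `pi = 0`, its best log-likelihood can never fall below the NB one; the AIC then decides whether the extra parameter is worth its penalty of 2.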
“…For the estimation of the model to predict future citations and to select variables we exclusively used the training cohort (N = 259), whilst the test cohort (N = 143) remained strictly reserved for model validation. All analyses were conducted using the statistical software R and extensions.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Chun and Griffith (2013) list R code for stepwise selection in GLMs based on SAC minimization. In the mpath package (Wang et al 2015), the be.zeroinfl function performs a backward elimination (and forward selection) based on maximum likelihood criteria, and can be applied to zero-inflated models. Further variable selection algorithms for zero-inflated count data are presented in the medical literature, all proposing LASSO-based approaches (Chen et al 2016; Zeng et al 2014; Buu et al 2011) with different types of penalizations.…”
Section: A Backward Stepwise Algorithm; citation type: mentioning
confidence: 99%
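
The general shape of a backward elimination, as mentioned in this statement, can be sketched generically. This is not the `be.zeroinfl` algorithm from mpath, whose internals are not described here: the function `backward_eliminate`, the feature names, and the toy criterion are all hypothetical, and any "lower is better" score (for example an AIC from a refitted model) could be plugged in.

```python
def backward_eliminate(features, criterion):
    """Greedy backward elimination.

    Starting from all features, repeatedly drop the single feature whose
    removal gives the lowest criterion value, and stop as soon as no
    removal improves on the current score.
    """
    current = list(features)
    best = criterion(current)
    while current:
        # Score every candidate model with exactly one feature removed.
        candidates = [
            (criterion([f for f in current if f != drop]), drop)
            for drop in current
        ]
        score, drop = min(candidates)
        if score >= best:
            break  # no single removal lowers the criterion
        current.remove(drop)
        best = score
    return current, best

# Toy criterion (assumption): 2 per retained feature, plus a large
# penalty for every truly useful feature that has been dropped.
USEFUL = {"age", "income"}

def toy_criterion(subset):
    return 2 * len(subset) + 10 * len(USEFUL - set(subset))

selected, final_score = backward_eliminate(
    ["age", "income", "noise1", "noise2"], toy_criterion
)
print(selected, final_score)  # prints ['age', 'income'] 4
```

In a real analysis the criterion would refit the zero-inflated model on each candidate subset, which makes every step far more expensive than in this toy setting.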