In scientific research, it matters that the questions studied are closely connected to real applications. In practice, several models are often appropriate for the same survey data, so a standard criterion is needed to choose the most efficient one. In this article, our primary interest is to compare and discuss criteria for model selection and their applications. We describe the approaches and procedures behind these methods and apply them to traffic violation data, where we look for the most appropriate model among Poisson regression, zero-inflated Poisson regression, and negative binomial regression to capture the relationship between the number of speed-regulation violations and factors including distance covered, motorcycle engine, and age of respondents, using AIC, BIC, and Vuong's test. Based on results on the training, validation, and test data sets, we find that AIC and BIC perform more consistently and robustly in model selection than Vuong's test. We also discuss the advantages and disadvantages of these methods and suggest potential directions for future research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium provided the original work is properly cited.
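As a rough illustration of the two information criteria named in the abstract above, the following minimal sketch computes AIC and BIC from a maximized log-likelihood, using the standard definitions AIC = 2k - 2 log L and BIC = k log n - 2 log L. It fits only an intercept-only Poisson model, and the toy counts and function names are assumptions made for illustration; they are not the article's data or code.

```python
import math


def poisson_loglik(counts, rate):
    """Log-likelihood of i.i.d. Poisson counts sharing one rate (rate > 0)."""
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k log n - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik


# Hypothetical toy counts (not the traffic violation data from the article).
counts = [0, 0, 1, 0, 2, 0, 3, 1, 0, 0]

# For an intercept-only Poisson model the MLE of the rate is the sample mean.
rate_hat = sum(counts) / len(counts)
ll = poisson_loglik(counts, rate_hat)

print("AIC:", aic(ll, n_params=1))
print("BIC:", bic(ll, n_params=1, n_obs=len(counts)))
```

To compare candidate models, the same two functions would be applied to each model's maximized log-likelihood and parameter count, and the model with the smaller value would be preferred under that criterion.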
Abstract: The literature on count regression models covers a wide range of studies and applications that use simple, standard models for count response variables: Poisson, binomial, negative binomial, geometric, or generalized Poisson regression models. These regression models have received considerable attention in various settings. Nevertheless, in many fields the distribution of the count response variable may display excess zeros, for which the aforementioned regression models may fail to provide an adequate fit. To remedy this shortcoming, a class of distributions known as zero-inflated models is regarded as the most appropriate approach for handling excess zeros. In addition to zero inflation, the sample data under investigation are quite often not completely observed; this is the missing data problem. In this study, our primary interest is in reviewing studies that simultaneously address the missing data problem and the zero-inflated feature when modeling zero-inflated data. We also discuss their methodologies and results, along with some potential directions for future research.
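For concreteness, the excess-zeros idea described above can be written out for the zero-inflated Poisson case. This is the standard textbook form of the distribution, not a formula taken from the reviewed studies; the symbols \pi (probability of a structural zero) and \lambda (Poisson mean) are notation assumed here.

```latex
P(Y = y) =
\begin{cases}
  \pi + (1 - \pi)\, e^{-\lambda}, & y = 0, \\[4pt]
  (1 - \pi)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!}, & y = 1, 2, \dots
\end{cases}
\qquad 0 \le \pi < 1,\ \lambda > 0 .
```

Setting \pi = 0 recovers the ordinary Poisson distribution, which is why a plain Poisson regression tends to underestimate the number of zeros when \pi > 0.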