In the analysis of count data, the equidispersion assumption is often unsuitable, in which case the Poisson regression model is inappropriate. As a generalization of the Poisson distribution, the COM-Poisson distribution can deal with under-, equi- and overdispersed count data. It is a member of the exponential family of distributions and has the Poisson and geometric distributions as special cases, as well as the Bernoulli distribution as a limiting case. In spite of the nice properties of the COM-Poisson distribution, its location parameter does not correspond to the expectation, which complicates the interpretation of regression models specified using this distribution. In this paper, we propose a straightforward reparametrization of the COM-Poisson distribution based on an approximation to the expectation of this distribution. The main advantage of our new parametrization is the straightforward interpretation of the regression coefficients in terms of the expectation of the count response variable, as usual in the context of generalized linear models. Furthermore, the estimation and inference for the new COM-Poisson regression model can be done based on the likelihood paradigm. We carried out simulation studies to verify the finite sample properties of the maximum likelihood estimators. The results from our simulation study show that the maximum likelihood estimators are unbiased and consistent for both regression and dispersion parameters. We observed that the empirical correlation between the regression and dispersion parameter estimators is close to zero, which suggests that these parameters are orthogonal. We illustrate the application of the proposed model through the analysis of three data sets with over-, under- and equidispersed count data. The study of distribution properties through a consideration of dispersion, zero-inflation and heavy-tail indices, together with the results of data analysis, shows the flexibility over standard approaches.
Therefore, we encourage the application of the new parametrization for the analysis of count data in the context of COM-Poisson regression models. The computational routines for fitting the original and new version of the COM-Poisson regression model and the analyzed data sets are available in the supplementary material. (Ribeiro Jr et al., arXiv:1801.09795v1 [stat.AP], 29 Jan 2018.)
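To make the reparametrization idea concrete, the following is a minimal Python sketch, not the authors' routines. It assumes the approximation referred to in the abstract is the commonly cited asymptotic one, E[Y] ≈ λ^(1/ν) − (ν − 1)/(2ν), and it evaluates the COM-Poisson probabilities with a truncated series. The function names (`com_poisson_pmf`, `lam_from_mu`, `mean_var`) and the truncation point are illustrative assumptions.

```python
import math

def com_poisson_pmf(lam, nu, max_terms=500):
    """COM-Poisson probabilities P(Y = y), y = 0..max_terms-1.

    Uses P(Y = y) proportional to lam**y / (y!)**nu, normalized over a
    truncated support (an approximation to the infinite series).
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_terms)]
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in log_w]
    z = sum(w)
    return [v / z for v in w]

def mean_var(pmf):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var

def lam_from_mu(mu, nu):
    """Invert the assumed approximation mu = lam**(1/nu) - (nu - 1)/(2*nu)."""
    return (mu + (nu - 1) / (2 * nu)) ** nu

# Target mean 5 under under-, equi- and overdispersion (nu = 2, 1, 0.5).
for nu in (2.0, 1.0, 0.5):
    mean, var = mean_var(com_poisson_pmf(lam_from_mu(5.0, nu), nu))
    print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

In this sketch the mean stays close to the target value while ν alone controls whether the variance falls below or above it, which is what makes coefficients on the mean scale easier to interpret.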
A Weibull-model-based approach is examined to handle under- and overdispersed discrete data in a hierarchical framework. This methodology was first introduced by Nakagawa and Osaki (1975, IEEE Transactions on Reliability, 24, 300–301), and later examined for under- and overdispersion by Klakattawi et al. (2018, Entropy, 20, 142) in the univariate case. Extensions to hierarchical approaches with under- and overdispersion were left unnoted, even though they can be obtained in a simple manner. This is of particular interest when analysing clustered/longitudinal data structures, where the underlying correlation structure is often more complex compared to cross-sectional studies. In this article, a random-effects extension of the Weibull-count model is proposed and applied to two motivating case studies, originating from the clinical and sociological research fields. A goodness-of-fit evaluation of the model is provided through a comparison of some well-known count models, that is, the negative binomial, Conway–Maxwell–Poisson and double Poisson models. Empirical results show that the proposed extension flexibly fits the data, more specifically, for heavy-tailed, zero-inflated, overdispersed and correlated count data. Discrete left-skewed time-to-event data structures are also flexibly modelled using the approach, with the ability to derive direct interpretations on the median scale, provided the complementary log–log link is used. Finally, a large simulated set of data is created to examine other characteristics such as computational ease and orthogonality properties of the model, with the conclusion that the approach behaves best for highly overdispersed cases.
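As a small illustration of how a Weibull-type count distribution can cover both under- and overdispersion, the sketch below uses the discrete Weibull probability mass function in the form P(Y = y) = q^(y^β) − q^((y+1)^β), y = 0, 1, 2, ..., with 0 < q < 1 and β > 0. It covers only the univariate distribution, not the random-effects extension described above, and the function names and the truncation point are assumptions made for illustration.

```python
def dw_pmf(y, q, beta):
    """Discrete Weibull probability P(Y = y) = q**(y**beta) - q**((y+1)**beta)."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def dw_mean_var(q, beta, max_y=200):
    """Mean and variance over the truncated support 0..max_y-1."""
    probs = [dw_pmf(y, q, beta) for y in range(max_y)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

def dw_median(q, beta):
    """Smallest y with CDF F(y) = 1 - q**((y+1)**beta) at least 0.5."""
    y = 0
    while 1 - q ** ((y + 1) ** beta) < 0.5:
        y += 1
    return y

for beta in (1.0, 3.0):
    mean, var = dw_mean_var(0.7, beta)
    print(f"beta={beta}: dispersion index={var / mean:.3f}, median={dw_median(0.7, beta)}")
```

With β = 1 the distribution reduces to the geometric case, which is overdispersed; larger β concentrates the mass and gives a variance below the mean. The truncation at `max_y` is adequate for these values but would need to be enlarged for small β, where the tail is heavy.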
In agronomic experiments, the presence of polytomous variables is common, and the generalized logit model can be used to analyze these data. One of the characteristics of the generalized logit model is the assumption that the variance is a known function of the mean, and the observed variance is expected to be close to that assumed by the model. However, it is not uncommon for extra-multinomial variation to occur, due to the systematic observation of data that are more heterogeneous than the variance specified by the model, a phenomenon known as overdispersion. In this context, the present work discusses a diagnostic of overdispersion in multinomial data, with the proposal of a descriptive measure for this problem, as well as presenting a methodological alternative through the Dirichlet-multinomial model. The descriptive measure is evaluated through simulation, based on two particular scenarios. As a motivational study, we report an experiment applied to fruit growing, whose objective was to compare the flowering of adult plants of an orange tree, grafted on “Rangpur” lime or “Swingle” citrumelo, with the response variable being the classification of branches into three categories: lateral flower; no flower or aborted flower; terminal flower. Through the proposed descriptive measure, evidence of overdispersion was verified, indicating that the generalized logit model may not be the most appropriate. Thus, as a methodological alternative, the Dirichlet-multinomial model was used. Compared to the generalized logit model, the Dirichlet-multinomial proved to be more suitable to fit the data with overdispersion, by allowing the inclusion of an additional parameter to accommodate the excessive extra-multinomial dispersion.
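The role of the additional Dirichlet-multinomial parameter can be seen from the standard variance formulas. The sketch below is not the descriptive measure proposed in the work, which is not specified here; it only compares the multinomial variance n·p_j·(1 − p_j) with the Dirichlet-multinomial variance, which multiplies it by (n + α₀)/(1 + α₀), where α₀ is the sum of the Dirichlet parameters. The function names and the example values are illustrative assumptions.

```python
def multinomial_var(n, p):
    """Per-category variance of a multinomial count with n trials."""
    return [n * pj * (1 - pj) for pj in p]

def dirichlet_multinomial_var(n, alpha):
    """Per-category variance of a Dirichlet-multinomial count.

    The category means match a multinomial with p_j = alpha_j / alpha_0,
    but the variance is inflated by (n + alpha_0) / (1 + alpha_0).
    """
    a0 = sum(alpha)
    inflation = (n + a0) / (1 + a0)
    return [n * (a / a0) * (1 - a / a0) * inflation for a in alpha]

# Hypothetical example: three categories, 20 branches per plant.
alpha = [2.0, 3.0, 5.0]
p = [a / sum(alpha) for a in alpha]
print(multinomial_var(20, p))
print(dirichlet_multinomial_var(20, alpha))
```

As α₀ grows the inflation factor tends to 1 and the multinomial variance is recovered, so small values of α₀ correspond to strong extra-multinomial variation.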