In statistical modeling, generalized linear models (GLMs) are used to describe a response variable as a function of one or more predictor variables. Computational methods and mixed distributions are frequently used to build predictive models for time-to-event data analysis. Developing a statistical model that predicts appropriately and accurately starts with choosing a distribution suited to the nature of the actual data. This paper proposes a new mixed negative binomial distribution for over-dispersed count data, the negative binomial-quasi Lindley (NB-QL) distribution. A new GLM framework for the NB-QL model is introduced to build a time series model for count data, and it is applied to actual data sets from the COVID-19 epidemic in Thailand. The models are related to GLMs in that they specify linear relationships between outcome variables and covariates, where the response variable takes the form of time series count data under an exponential family distribution, with random components and link functions. In this study, we examine the factors that affect the number of COVID-19 deaths in Thailand and provide predictive modeling of the number of COVID-19 deaths from 1 January 2020 to 31 December 2020, a data set of 366 daily observations. For the models with the NB-QL and NB distributions, the probability integral transform (PIT) histogram approaches the uniform distribution. Based on the deviance, the DIC, and the PIT histogram, the proposed model is also suitable for forecasting the daily number of COVID-19 deaths in Thailand, indicating that the NB-QL time series model is another efficient alternative for modeling over-dispersed count data.
According to the NB-QL time series model of daily COVID-19 deaths in Thailand, the average number of daily COVID-19 deaths is influenced by the number of COVID-19 deaths in the previous 3 days. The average number of COVID-19 deaths in Thailand is also influenced by the previous 2 days. At the same time, the daily number of infected cases in Thailand is influenced by the daily number of COVID-19 deaths. In addition, the model includes intervention components with internal covariate effects, as there was a surge in the daily number of COVID-19 deaths in Thailand at the time.
HIGHLIGHTS
A new mixture NB distribution is proposed as a flexible alternative for analyzing over-dispersed count data. The new distribution is a mixture of the NB and QL distributions and is named the negative binomial-quasi Lindley (NB-QL) distribution
A new mixed negative binomial distribution is developed for time series count data with over-dispersion, and a Bayesian approach is used to estimate the parameters of the proposed model. The GLM framework is applied to build the time series count data model
The new mixed NB distribution in this study is an effective alternative for modeling over-dispersed count data
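The mixing idea in the highlights can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's estimation procedure. It assumes the common mixed-NB construction in which the NB success probability is p = exp(-λ) and λ follows a quasi Lindley distribution with density θ(α + θx)e^{-θx}/(α + 1). The exact parameterization used in the paper is not given in this section. The function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def sample_quasi_lindley(alpha, theta, rng):
    """Draw from QL(alpha, theta) using its two-component form:
    Exp(rate=theta) with weight alpha/(alpha+1), else Gamma(shape=2, rate=theta)."""
    if rng.random() < alpha / (alpha + 1.0):
        return rng.expovariate(theta)
    return rng.gammavariate(2.0, 1.0 / theta)  # gammavariate takes a scale, so 1/theta


def sample_nb(r, p, rng):
    """Failures before the r-th success (integer r), as a sum of r geometric draws."""
    if p >= 1.0:
        return 0
    if p <= 0.0:
        raise ValueError("p must be positive")
    log_q = math.log1p(-p)  # log(1 - p), accurate for small p
    total = 0
    for _ in range(r):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        total += int(math.log(u) / log_q)  # geometric draw by inversion
    return total


def sample_nb_ql(r, alpha, theta, rng):
    """Assumed mixture: lambda ~ QL(alpha, theta), then X ~ NB(r, p = exp(-lambda))."""
    lam = sample_quasi_lindley(alpha, theta, rng)
    return sample_nb(r, math.exp(-lam), rng)


def dispersion_index(r, alpha, theta, n, seed):
    """Sample variance-to-mean ratio; values above 1 indicate over-dispersion."""
    rng = random.Random(seed)
    draws = [sample_nb_ql(r, alpha, theta, rng) for _ in range(n)]
    return statistics.pvariance(draws) / statistics.fmean(draws)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    print(dispersion_index(r=2, alpha=1.0, theta=5.0, n=20000, seed=1))
```

A variance-to-mean ratio above 1 in the simulated counts is the over-dispersion that a Poisson model cannot reproduce, which is the motivation for mixed NB alternatives.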
GRAPHICAL ABSTRACT