1980
DOI: 10.2307/2287455

An Empirical Investigation of Goodness-of-Fit Statistics for Sparse Multinomials

Abstract: Traditional discussions of goodness-of-fit tests for multinomial data consider asymptotic chi-squared properties under the assumption that all expected cell frequencies become large. However, this condition is not always satisfied and other asymptotic theories must be considered. For testing a specified simple hypothesis, Morris gave conditions for the asymptotic normality of the Pearson and likelihood ratio statistics when both the sample size and number of cells become large (even if the expected cell frequ…
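As a rough illustration of the two statistics the abstract names, the sketch below computes the Pearson statistic and the likelihood ratio statistic for a simple hypothesis on a sparse table, using their standard textbook definitions. The counts, the equiprobable null, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def pearson_x2(observed, expected):
    """Pearson statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def likelihood_ratio_g2(observed, expected):
    """Likelihood ratio statistic: 2 * sum of O * ln(O / E).

    Cells with zero observed count contribute 0 to the sum.
    """
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical sparse sample: 10 observations spread over 20 cells,
# tested against an equiprobable null, so every expected count is 0.5.
counts = [2, 1, 0, 0, 1, 0, 3, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
n = sum(counts)
k = len(counts)
expected = [n / k] * k

x2 = pearson_x2(counts, expected)
g2 = likelihood_ratio_g2(counts, expected)

print(f"n = {n}, k = {k}, expected count per cell = {n / k}")
print(f"Pearson X^2 = {x2:.4f}")
print(f"Likelihood ratio G^2 = {g2:.4f}")
```

With expected counts this small, the two statistics can differ noticeably, and referring either one to a chi-squared distribution with k - 1 degrees of freedom may be unreliable, which is the setting the abstract describes.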

Cited by 122 publications
(104 citation statements)
References 0 publications
“…Note that p-values are not reported for the test statistics in these models because the degrees of freedom (df) are large (df ≥ 99 for each model). Large models suffer from sparseness in the observed data table; when data are sparse it has been shown that the distribution of the G² does not follow a chi-square distribution (Koehler, 1986; Koehler & Larntz, 1980). Lower AIC and BIC values reflect an optimal balance between model fit and parsimony.…”
Section: Results (mentioning)
confidence: 99%
“…Furthermore, the number of communities involved tends to produce a table with a larger number of rows than is typical of a cross-tabulation table in most analytical applications. One early paper suggests that χ² may still function adequately with low frequencies of observations, although perhaps with a correction (i.e., using df = k - 2 instead of k - 1) [33]. More work may be required to adequately assess how well IP2 functions with zero counts, and it may possibly need to be adjusted using two stage models [34], mixture models [35], or a generalized Poisson distribution [36].…”
Section: Discussion (mentioning)
confidence: 99%
“…If that number is approaching zero for at least one cell, then we can expect that the statistic will behave aberrantly. For instance, [9] cites research showing that the asymptotic behavior of Xₙ² is not as expected, giving some explanation for the well-known rule of thumb of having at least five as the expected count in each cell (though subsequent work has shown this particular cutoff to be conservative [10]).…”
Section: Bin Selection For Xₙ² (mentioning)
confidence: 99%