We describe an algorithm for the sequential sampling of entries in multiway
contingency tables with given constraints. The algorithm can be used for
computations in exact conditional inference. To justify the algorithm, we develop
a theory that relates the sampling values at each step to properties of the
associated toric ideal, using computational commutative algebra. In particular, the property of
interval cell counts at each step is related to exponents on lead
indeterminates of a lexicographic Gr\"{o}bner basis. Also, the approximation of
integer programming by linear programming for sampling is related to initial
terms of a toric ideal. We apply the algorithm to examples of contingency
tables which appear in the social and medical sciences. The numerical results
demonstrate that the theory is applicable and that the algorithm performs well.
Comment: Published at http://dx.doi.org/10.1214/009053605000000822 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
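To make the idea of sequential sampling with interval cell counts concrete, the following is a minimal sketch for the much simpler case of a two-way table with fixed row and column sums. It is not the multiway algorithm of the paper: the function name `sample_table`, the uniform proposal on each interval, and the row-by-row cell order are all assumptions chosen for illustration. In this two-way setting the feasible values of each cell, given the cells already filled, form an integer interval `[lo, hi]`, and the product of the interval lengths is an importance weight whose average estimates the number of tables.

```python
import random

def sample_table(row_sums, col_sums, rng=random):
    """Fill a two-way table cell by cell; return (table, importance_weight)."""
    assert sum(row_sums) == sum(col_sums), "margins must have equal totals"
    col_left = list(col_sums)        # column totals not yet allocated
    table, weight = [], 1
    for r in row_sums:
        row, row_left = [], r        # row total not yet allocated
        for j in range(len(col_left)):
            later = sum(col_left[j + 1:])     # room left in the later columns
            lo = max(0, row_left - later)     # rest of the row must still fit
            hi = min(row_left, col_left[j])   # cannot exceed either margin
            x = rng.randint(lo, hi)           # uniform draw on the interval
            weight *= hi - lo + 1             # 1 / proposal probability
            row.append(x)
            row_left -= x
            col_left[j] -= x
        table.append(row)
    return table, weight

if __name__ == "__main__":
    draws = [sample_table([3, 2, 4], [4, 5], random.Random(s)) for s in range(1000)]
    estimate = sum(w for _, w in draws) / len(draws)
    print("estimated number of tables:", estimate)
```

For a 2 x 2 table only the first cell is free, so the weight of every draw equals the exact number of tables; larger tables give weights that vary from draw to draw.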
The Hardy-Weinberg law is among the most important principles in the study of biological systems. Given its importance, many tests have been devised to determine whether a finite population follows Hardy-Weinberg proportions. Because asymptotic tests can fail, Guo and Thompson developed an exact test; unfortunately, the Monte Carlo method they proposed to evaluate their test has a running time that grows linearly in the size of the population N. Here, we propose a new algorithm whose expected running time is linear in the size of the table produced, and completely independent of N. In practice, this new algorithm can be considerably faster than the original method.
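As a point of reference for what an exact test of Hardy-Weinberg proportions computes, here is a small sketch for the two-allele case, where the conditional distribution of the heterozygote count given the allele counts can be enumerated in full. This is neither the Monte Carlo method of Guo and Thompson nor the new algorithm proposed above; the function names `hw_prob` and `hw_exact_pvalue` are hypothetical, and the p-value convention (summing the probabilities of all outcomes no more likely than the observed one) is an assumption.

```python
from fractions import Fraction
from math import factorial

def hw_prob(n_ab, n_a, n_b):
    """P(n_ab heterozygotes | allele counts n_a, n_b) under Hardy-Weinberg."""
    n = (n_a + n_b) // 2                  # number of individuals
    n_aa = (n_a - n_ab) // 2              # homozygotes implied by the counts
    n_bb = (n_b - n_ab) // 2
    num = factorial(n) * factorial(n_a) * factorial(n_b) * 2 ** n_ab
    den = factorial(n_aa) * factorial(n_ab) * factorial(n_bb) * factorial(2 * n)
    return Fraction(num, den)

def hw_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact p-value from observed genotype counts (two alleles only)."""
    n_a, n_b = 2 * n_aa + n_ab, 2 * n_bb + n_ab
    observed = hw_prob(n_ab, n_a, n_b)
    # The heterozygote count has the same parity as the allele counts.
    support = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = [hw_prob(h, n_a, n_b) for h in support]
    return sum(p for p in probs if p <= observed)

if __name__ == "__main__":
    print(float(hw_exact_pvalue(10, 3, 7)))
```

Full enumeration is only practical for a few alleles, which is why sampling methods are used for larger genotype tables.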
Markov chains and sequential importance sampling (SIS) are described as two leading sampling methods for Monte Carlo computations in exact conditional inference on discrete data in contingency tables. Examples are explained from genotype data analysis, graphical models, and logistic regression. A new Markov chain and implementation of SIS are described for logistic regression.
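The Markov chain approach can be illustrated on a two-way table with fixed margins, using the standard move that adds +1, -1, -1, +1 on a randomly chosen 2 x 2 sub-block. The sketch below is an assumption-laden illustration, not the new chain for logistic regression mentioned above: the name `markov_step` is hypothetical, and the chain as written targets the uniform distribution on tables. Sampling from the hypergeometric distribution used in exact tests of independence would need an additional Metropolis acceptance step.

```python
import random

def markov_step(table, rng=random):
    """One step of a random walk on tables with fixed margins (in place)."""
    i1, i2 = rng.sample(range(len(table)), 2)      # two distinct rows
    j1, j2 = rng.sample(range(len(table[0])), 2)   # two distinct columns
    # The move +1 at (i1,j1),(i2,j2) and -1 at (i1,j2),(i2,j1) keeps every
    # row and column sum unchanged; reject it if a cell would go negative.
    if table[i1][j2] > 0 and table[i2][j1] > 0:
        table[i1][j1] += 1
        table[i2][j2] += 1
        table[i1][j2] -= 1
        table[i2][j1] -= 1
    return table

if __name__ == "__main__":
    rng = random.Random(1)
    table = [[3, 0, 1], [1, 2, 2]]
    for _ in range(1000):
        markov_step(table, rng)
    print(table)
```

Because the reverse of each move is proposed with the same probability, the proposal is symmetric, and rejecting infeasible moves leaves the uniform distribution stationary.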
We present algebraic methods for studying connectivity of Markov moves with margin positivity. The purpose is to develop Markov sampling methods for exact conditional inference in statistical models where a Markov basis is hard to compute. In some cases positive margins are shown to allow a set of Markov connecting moves that are much simpler than the full Markov basis.