SUMMARY Maximum likelihood estimates (MLEs) in autologistic models and other exponential family models for dependent data can be calculated with Markov chain Monte Carlo methods (the Metropolis algorithm or the Gibbs sampler), which simulate ergodic Markov chains whose equilibrium distributions lie in the model. From a single realization of such a Markov chain, a Monte Carlo approximation to the whole likelihood function can be constructed, and the parameter value (if any) maximizing this function approximates the MLE. When no parameter point in the model maximizes the likelihood, the MLE in the closure of the exponential family may still exist; it can be calculated by a two-phase algorithm that first finds the support of the MLE by linear programming and then finds the distribution within the family conditioned on that support by maximizing the likelihood for the conditional family. These methods are illustrated by a constrained autologistic model for DNA fingerprint data. MLEs are compared with maximum pseudolikelihood estimates (MPLEs) and with maximum conditional likelihood estimates (MCLEs), neither of which produces acceptable estimates: the MPLE because it overestimates dependence, and the MCLE because conditioning removes the constraints.
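The Monte Carlo likelihood idea described above can be sketched for a toy one-parameter family p_θ(x) ∝ exp(θ t(x)) on a binary grid, where t(x) counts agreeing nearest-neighbor pairs. All function names here are hypothetical, and this is a drastic simplification of the paper's constrained model: a single Metropolis chain is run at a reference value ψ, and the log-likelihood ratio l(θ) − l(ψ) = (θ−ψ) t(x_obs) − log[(1/n) Σ_i exp((θ−ψ) t(X_i))] is then approximated from that one realization.

```python
import numpy as np

def neighbor_agreements(x):
    """Sufficient statistic t(x): number of equal-valued nearest-neighbor pairs."""
    return int((x[:-1, :] == x[1:, :]).sum() + (x[:, :-1] == x[:, 1:]).sum())

def metropolis_chain(shape, theta, n_samples, burn=1000, thin=10, seed=0):
    """Single-site-flip Metropolis sampler for p_theta(x) proportional to
    exp(theta * t(x)); returns the sampled values of t(x)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=shape)
    stats = []
    for it in range(burn + n_samples * thin):
        i, j = rng.integers(0, shape[0]), rng.integers(0, shape[1])
        t_old = neighbor_agreements(x)
        x[i, j] ^= 1                     # propose flipping one site
        t_new = neighbor_agreements(x)
        if np.log(rng.random()) >= theta * (t_new - t_old):
            x[i, j] ^= 1                 # reject: undo the flip
        if it >= burn and (it - burn) % thin == 0:
            stats.append(neighbor_agreements(x))
    return np.array(stats)

def mc_loglik(theta_grid, psi, t_obs, t_samples):
    """Monte Carlo approximation to l(theta) - l(psi), built from one chain
    run at the reference parameter psi."""
    d = theta_grid[:, None] - psi
    # log of the Monte Carlo estimate of the normalizing-constant ratio
    log_ratio = np.log(np.mean(np.exp(d * t_samples[None, :]), axis=1))
    return d.ravel() * t_obs - log_ratio
```

Maximizing `mc_loglik` over a grid (or with a numerical optimizer) gives the Monte Carlo approximation to the MLE; by construction the approximated log-likelihood ratio is exactly zero at θ = ψ, and the approximation is most reliable for θ near ψ.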
Markov chain Monte Carlo (MCMC; the Metropolis-Hastings algorithm) has been used for many statistical problems, including Bayesian inference, likelihood inference, and tests of significance. Although the method often works well, doubts about convergence remain in all applications. Here we propose MCMC methods distantly related to simulated annealing. Our samplers mix rapidly enough to be usable for problems in which other methods would require eons of computing time. They simulate realizations from a sequence of distributions, allowing the distribution being simulated to vary randomly over time. If the sequence of distributions is well chosen, the sampler mixes well and produces accurate answers for all of the distributions. Even when there is only one distribution of interest, these annealing-like samplers may be the only known way to obtain a rapidly mixing sampler. These methods are essential for attacking very hard problems, which arise in areas such as statistical genetics. We illustrate the methods with an application that is much harder than any problem previously done by Markov chain Monte Carlo: ancestral inference on a very large genealogy (7 generations, 2024 individuals). The problem is to find, conditional on data on living individuals, the probability that each individual was a carrier of cystic fibrosis. The unconditional probabilities are easy to calculate, but exact calculation of the conditional probabilities is infeasible. Moreover, a Gibbs sampler for the problem would not mix in a reasonable time, even on the fastest imaginable computers; our annealing-like samplers have mixing times of a few hours. We also give examples of samplers for the "witch's hat" distribution and the conditional Strauss process. The methods may also be useful for easier problems: a common concern about MCMC is that one can never be sure that the chain has mixed well and the answers are correct. Although we have no guaranteed convergence bounds for our methods, annealing-like samplers appear to be overkill for easy problems and should dispel doubts about convergence.
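The sequence-of-distributions idea can be sketched as a minimal simulated-tempering sampler on a bimodal toy target. This is an illustrative reduction, not the paper's sampler: in particular it omits the pseudoprior constants that a practical tempering sampler needs to balance the time spent on each temperature rung, a shortcut that is tolerable only when the rungs' normalizing constants are comparable.

```python
import numpy as np

def simulated_tempering(log_target, temps, n_iter, step=1.0, seed=0):
    """Simulated-tempering sketch: the chain moves both in x and along a
    ladder of tempered densities pi_k(x) proportional to
    exp(log_target(x) / temps[k]).  Hot rungs let the chain cross between
    modes; only visits to the coldest rung (k == 0) are kept as draws from
    the distribution of interest."""
    rng = np.random.default_rng(seed)
    x, k = 0.0, 0
    cold_samples = []
    for _ in range(n_iter):
        # Metropolis update of x at the current temperature
        prop = x + step * rng.normal()
        if np.log(rng.random()) < (log_target(prop) - log_target(x)) / temps[k]:
            x = prop
        # propose moving one rung up or down the temperature ladder
        k_new = k + rng.choice([-1, 1])
        if 0 <= k_new < len(temps):
            log_accept = log_target(x) / temps[k_new] - log_target(x) / temps[k]
            if np.log(rng.random()) < log_accept:
                k = k_new
        if k == 0:
            cold_samples.append(x)
    return np.array(cold_samples)
```

On a two-mode target whose modes are far apart relative to the proposal step, a fixed-temperature Metropolis chain stays trapped in one mode for an enormous number of iterations, whereas the tempered chain crosses between modes while hot and then cools back down, so the cold-rung samples visit both modes.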
The lifetime fitnesses of the individuals comprising a population determine its numerical dynamics, and genetic variation in fitness results in evolutionary change. The dual importance of individual fitness is well understood, but empirical fitness records generally violate the assumptions of standard statistical approaches. This problem has plagued comprehensive study of fitness and impeded empirical study of the link between the numerical and genetic dynamics of populations. Recently developed aster models address this problem by explicitly modeling the dependence of later-expressed components of fitness (e.g. fecundity) on those expressed earlier (e.g. survival to reproduce). Moreover, aster models employ different sampling distributions for different components of fitness, as appropriate (e.g. binomial for survival over a given interval and Poisson for fecundity). The analysis is conducted by maximum likelihood, and the resulting compound distributions for lifetime fitness closely approximate the observed data. We illustrate the breadth of aster's utility with three examples demonstrating estimation of the finite rate of increase, comparison of mean fitness among genotypic groups, and phenotypic selection analysis. Aster models offer a unified approach to the breadth of questions in evolution and ecology for which life history data are gathered.
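The idea of modeling later fitness components conditionally on earlier ones can be sketched with a minimal two-node example: Bernoulli survival, then Poisson fecundity conditional on survival. This is a toy reduction for illustration, not the aster software itself; because the joint likelihood factors into its conditional pieces, the MLEs here are simple sample quantities.

```python
import numpy as np

def fit_aster_like(survived, fecundity):
    """Toy two-node conditional fit: survival ~ Bernoulli(p), and fecundity
    given survival ~ Poisson(lam).  The joint likelihood factors as
    (survival terms) x (fecundity-given-survival terms), so each piece is
    maximized separately."""
    survived = np.asarray(survived, dtype=float)
    fecundity = np.asarray(fecundity, dtype=float)
    p_hat = survived.mean()                    # MLE of survival probability
    lam_hat = fecundity[survived == 1].mean()  # MLE of mean fecundity among survivors
    return p_hat, lam_hat, p_hat * lam_hat     # last value: expected lifetime fitness
```

Real aster models chain more nodes (e.g. survival over several intervals, whether the plant flowered, seed count), with one conditional sampling distribution per arrow of the graph, and the compound distribution of lifetime fitness follows from the product of the conditional pieces, here simply p·λ.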