Nonlinear stochastic dynamical systems are widely used to model systems across the sciences and engineering. Such models are natural to formulate and can be analyzed mathematically and numerically. However, difficulties associated with inference from time-series data about unknown parameters in these models have been a constraint on their application. We present a new method that makes maximum likelihood estimation feasible for partially observed nonlinear stochastic dynamical systems (also known as state-space models) where this was not previously the case. The method is based on a sequence of filtering operations which are shown to converge to a maximum likelihood parameter estimate. We make use of recent advances in nonlinear filtering in the implementation of the algorithm. We apply the method to the study of cholera in Bangladesh. We construct confidence intervals, perform residual analysis, and apply other diagnostics. Our analysis, based upon a model capturing the intrinsic nonlinear dynamics of the system, reveals some effects overlooked by previous studies.
maximum likelihood | cholera | time series
State space models have applications in many areas, including signal processing (1), economics (2), cell biology (3), meteorology (4), ecology (5), neuroscience (6), and various others (7-9). Formally, a state space model is a partially observed Markov process. Real-world phenomena are often well modeled as Markov processes, constructed according to physical, chemical, or economic principles, about which one can make only noisy or incomplete observations.
It has been noted repeatedly (1, 10) that estimating parameters for state space models is simplest if the parameters are time-varying random variables that can be included in the state space. Estimation of parameters then becomes a matter of reconstructing unobserved random variables, and inference may proceed by using standard techniques for filtering and smoothing.
This approach is of limited value if the true parameters are thought not to vary with time, or to vary as a function of measured covariates rather than as random variables. A major motivation for this work has been the observation that the particle filter (9-13) is a conceptually simple, flexible, and effective filtering technique for which the only major drawback was the lack of a readily applicable technique for likelihood maximization in the case of time-constant parameters. The contribution of this work is to show how time-varying parameter algorithms may be harnessed for use in inference in the fixed-parameter case. The key result, Theorem 1, shows that an appropriate limit of time-varying parameter models can be used to locate a maximum of the fixed-parameter likelihood. This result is then used as the basis for a procedure for finding maximum likelihood estimates for previously intractable models.
We use the method to further our understanding of the mechanisms of cholera transmission. Cholera is a disease endemic to India and Bangladesh that has recently become reestablished in Africa, s...
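The limit of time-varying parameter models behind Theorem 1 can be sketched in equations. The notation below ($\theta_n$, $\sigma$, $c$, $\Sigma$, $\hat\theta_n$, $V_n$, $f$) is assumed here for illustration and does not appear in the excerpt. The result is stated only in rough form, so the exact statement and its regularity conditions should be checked against the paper. The fixed parameter $\theta$ is replaced by a random walk whose variance is scaled by $\sigma$, and filtering quantities for that walk are combined as $\sigma \to 0$:

```latex
% Time-varying parameter model (assumed notation): a random walk around \theta
E[\theta_0] = \theta, \qquad
\mathrm{Var}(\theta_0) = \sigma^2 c^2 \Sigma, \qquad
E[\theta_n \mid \theta_{n-1}] = \theta_{n-1}, \qquad
\mathrm{Var}(\theta_n \mid \theta_{n-1}) = \sigma^2 \Sigma,
\quad n = 1, \dots, N.

% Filtering mean and prediction variance of the perturbed parameter
\hat\theta_n = E[\theta_n \mid y_{1:n}], \qquad
V_n = \mathrm{Var}(\theta_n \mid y_{1:n-1}), \qquad
\hat\theta_0 = \theta.

% Limit of roughly this form: filtering output recovers the score of the
% fixed-parameter likelihood
\lim_{\sigma \to 0} \sum_{n=1}^{N} V_n^{-1} \left( \hat\theta_n - \hat\theta_{n-1} \right)
  = \nabla_\theta \log f(y_{1:N} \mid \theta).
```

Read this way, a filter run on the time-varying model yields an approximation to the gradient of the fixed-parameter log likelihood, which is what allows a sequence of filtering operations to climb toward a maximum.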
Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods for POMP models have been implemented in this framework including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and trajectory matching. In this paper, we demonstrate the application of these methodologies using some simple toy problems. We also illustrate the specification of more complex POMP models, using a nonlinear epidemiological model with a discrete population, seasonality, and extra-demographic stochasticity. We discuss the specification of user-defined models and the development of additional methods within the programming environment provided by pomp.
*This document is a version of a manuscript in press at the Journal of Statistical Software. It is provided under the Creative Commons Attribution License.
Partially observed Markov processes
... approximations are adequate for one's purposes, or when the latent process takes values in a small, discrete set, methods that exploit these additional assumptions to advantage, such as the extended and ensemble Kalman filter methods or exact hidden-Markov-model methods, are available, but not yet as part of pomp. It is the class of nonlinear, non-Gaussian POMP models with large state spaces upon which pomp is focused.
A POMP model may be characterized by the transition density for the Markov process and the measurement density. However, some methods require only simulation from the transition density whereas others require evaluation of this density. Still other methods may not work with the model itself but with an approximation, such as a linearization.
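The difference between simulating from the transition density and evaluating it can be illustrated with a small Python sketch. The toy model (a latent AR(1) process observed with Gaussian noise), the parameter names `phi`, `sigma`, `tau`, and the function names are all assumptions made for this example. The code is not the pomp interface, which is written in R.

```python
import math
import random

# Hypothetical toy POMP: latent AR(1) state observed with Gaussian noise.
# All names and parameter values are illustrative assumptions.

def normal_logpdf(x, mean, sd):
    """Log density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def rprocess(x_prev, params, rng):
    """Simulate one step of the latent Markov process."""
    return params["phi"] * x_prev + rng.gauss(0.0, params["sigma"])

def dprocess(x_next, x_prev, params):
    """Evaluate the log transition density of x_next given x_prev."""
    return normal_logpdf(x_next, params["phi"] * x_prev, params["sigma"])

def rmeasure(x, params, rng):
    """Simulate an observation given the latent state."""
    return x + rng.gauss(0.0, params["tau"])

def dmeasure(y, x, params):
    """Evaluate the log measurement density of y given the latent state."""
    return normal_logpdf(y, x, params["tau"])

def simulate(params, n_steps, x0=0.0, seed=1):
    """Draw a latent trajectory and matching observations."""
    rng = random.Random(seed)
    states, obs = [], []
    x = x0
    for _ in range(n_steps):
        x = rprocess(x, params, rng)
        states.append(x)
        obs.append(rmeasure(x, params, rng))
    return states, obs

params = {"phi": 0.8, "sigma": 1.0, "tau": 0.5}
states, obs = simulate(params, n_steps=50)
```

For this linear Gaussian toy model both `rprocess` and `dprocess` are easy to write. In many mechanistic models only the simulator is practical, which is why the distinction matters when choosing an inference method.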
Algorithms for which the dynamic model is specified only via a simulator are said to be plug-and-play (Bretó et al. 2009; He et al. 2010). Plug-and-play methods can be employed once one has "plugged" a model simulator into the inference machinery. Since many POMP models of scientific interest are relatively easy to simulate, the plug-and-play property facilitates data analysis. Even if one candidate model has tractable transition probabilities, a scientist will frequently wish to consider alternative models for which these probabilities are intractable. In a plug-and-play methodological environment, analysis of variations in the model can often be achieved by changing a few lines of the model simulator codes. The price one pays for the flexibility of plug-and-play methodology is primarily additional computational effort, which can be substantial. Nevertheless, plug-and-play methods implemented using pomp have proved capable for state-of-the-art inference problems (e.g., King et al. 2008; Bhadra et al. 2011; Shrestha et al. 2011, 2013; Earn et al. 2012...
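The plug-and-play property can be sketched with a basic bootstrap particle filter in Python. This is a minimal illustration under assumed names and toy models, not the implementation in pomp. The filter touches the latent process only through a simulator passed in as `rprocess`, so swapping in a different model amounts to passing a different function. The two simulators (`ar1_step`, `jump_step`) and the measurement model are hypothetical.

```python
import math
import random

def particle_filter_loglik(obs, rprocess, dmeasure, x0, n_particles, rng):
    """Bootstrap particle filter estimate of the log likelihood.

    The latent process enters only through the simulator `rprocess`;
    its transition density is never evaluated.
    """
    particles = [x0] * n_particles
    loglik = 0.0
    for y in obs:
        # Propagate every particle through the model simulator.
        particles = [rprocess(x, rng) for x in particles]
        # Weight by the measurement log density, shifted by the max for stability.
        logw = [dmeasure(y, x) for x in particles]
        m = max(logw)
        if m == -math.inf:
            return -math.inf
        w = [math.exp(lw - m) for lw in logw]
        # Conditional log likelihood of y is the log of the mean weight.
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik

def gaussian_logpdf(y, mean, sd):
    z = (y - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def dmeasure(y, x):
    """Hypothetical measurement model: Gaussian noise with sd 0.5."""
    return gaussian_logpdf(y, x, 0.5)

def ar1_step(x, rng):
    """Hypothetical candidate model A: Gaussian AR(1) dynamics."""
    return 0.8 * x + rng.gauss(0.0, 1.0)

def jump_step(x, rng):
    """Hypothetical candidate model B: AR(1) dynamics with occasional jumps."""
    jump = rng.gauss(0.0, 5.0) if rng.random() < 0.1 else 0.0
    return 0.8 * x + rng.gauss(0.0, 1.0) + jump

# Synthetic data generated from model A.
data_rng = random.Random(42)
x, obs = 0.0, []
for _ in range(40):
    x = ar1_step(x, data_rng)
    obs.append(x + data_rng.gauss(0.0, 0.5))

# Same inference code, two different plugged-in simulators.
loglik_ar1 = particle_filter_loglik(obs, ar1_step, dmeasure, 0.0, 500, random.Random(1))
loglik_jump = particle_filter_loglik(obs, jump_step, dmeasure, 0.0, 500, random.Random(1))
```

The two log-likelihood estimates can then be compared as one would compare any pair of candidate models. The estimates are Monte Carlo quantities, so they vary with the seed and the number of particles, which reflects the computational cost noted in the text.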
In many infectious diseases, an unknown fraction of infections produce symptoms mild enough to go unrecorded, a fact that can seriously compromise the interpretation of epidemiological records. This is true for cholera, a pandemic bacterial disease, where estimates of the ratio of asymptomatic to symptomatic infections have ranged from 3 to 100 (refs 1-5). In the absence of direct evidence, understanding of fundamental aspects of cholera transmission, immunology and control has been based on assumptions about this ratio and about the immunological consequences of inapparent infections. Here we show that a model incorporating high asymptomatic ratio and rapidly waning immunity, with infection both from human and environmental sources, explains 50 yr of mortality data from 26 districts of Bengal, the pathogen's endemic home. We find that the asymptomatic ratio in cholera is far higher than had been previously supposed and that the immunity derived from mild infections wanes much more rapidly than earlier analyses have indicated. We find, too, that the environmental reservoir (free-living pathogen) is directly responsible for relatively few infections but that it may be critical to the disease's endemicity. Our results demonstrate that inapparent infections can hold the key to interpreting the patterns of disease outbreaks. New statistical methods, which allow rigorous maximum likelihood inference based on dynamical models incorporating multiple sources and outcomes of infection, seasonality, process noise, hidden variables and measurement error, make it possible to test more precise hypotheses and obtain unexpected results. Our experience suggests that the confrontation of time-series data with mechanistic models is likely to revise our understanding of the ecology of many infectious diseases.
Statistical inference for mechanistic models of partially observed dynamic systems is an active area of research. Most existing inference methods place substantial restrictions upon the form of models that can be fitted and hence upon the nature of the scientific hypotheses that can be entertained and the data that can be used to evaluate them. In contrast, the so-called plug-and-play methods require only simulations from a model and are thus free of such restrictions. We show the utility of the plug-and-play approach in the context of an investigation of measles transmission dynamics. Our novel methodology enables us to ask and answer questions that previous analyses have been unable to address. Specifically, we demonstrate that plug-and-play methods permit the development of a modelling and inference framework applicable to data from both large and small populations. We thereby obtain novel insights into the nature of heterogeneity in mixing and comment on the importance of including extra-demographic stochasticity as a means of dealing with environmental stochasticity and model misspecification. Our approach is readily applicable to many other epidemiological and ecological systems.