Modern parallel computing devices, such as the graphics processing unit (GPU), have gained significant traction in scientific and statistical computing. They are particularly well-suited to data-parallel algorithms such as the particle filter, or more generally sequential Monte Carlo (SMC), which are increasingly used in statistical inference. SMC methods carry a set of weighted particles through repeated propagation, weighting and resampling steps. The propagation and weighting steps are straightforward to parallelise, as they require only independent operations on each particle. The resampling step is more difficult, as standard schemes require a collective operation, such as a sum, across particle weights. Focusing on this resampling step, we analyse two alternative schemes that do not involve a collective operation (Metropolis and rejection resamplers), and compare them to standard schemes (multinomial, stratified and systematic resamplers). We find that, in certain circumstances, the alternative resamplers can perform significantly faster on a GPU, and to a lesser extent on a CPU, than the standard approaches. Moreover, in single precision, the standard approaches are numerically biased for upwards of hundreds of thousands of particles, while the alternatives are not. This is particularly important given the greater single- than double-precision throughput of modern devices, and the consequent temptation to use single precision with a greater number of particles. Finally, we provide auxiliary functions useful for implementation, such as for the permutation of ancestry vectors to enable in-place propagation.
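To make the collective-operation-free idea concrete, here is a minimal Python/NumPy sketch of a Metropolis resampler in the spirit described above. The vectorised form, the function name and the fixed number of chain steps `B` are illustrative assumptions; this is not the paper's GPU implementation.

```python
import numpy as np

def metropolis_resample(weights, B, rng):
    """Draw N ancestor indices without ever summing the weights.

    Each particle runs an independent Metropolis chain over particle
    indices, so the work is embarrassingly parallel. B trades bias for
    speed: larger B approaches exact multinomial resampling.
    """
    N = len(weights)
    ancestors = np.arange(N)                    # each chain starts at its own index
    for _ in range(B):
        proposals = rng.integers(0, N, size=N)  # uniform index proposals
        u = rng.uniform(size=N)
        # Accept with probability min(1, w_proposal / w_current).
        accept = u * weights[ancestors] <= weights[proposals]
        ancestors = np.where(accept, proposals, ancestors)
    return ancestors

# Toy usage: resample 10 particles with skewed weights.
rng = np.random.default_rng(42)
w = rng.exponential(size=10)
print(metropolis_resample(w, B=32, rng=rng))
```

Because no normalising sum is ever formed, each chain touches only weight ratios, which is what avoids both the collective operation and the single-precision accumulation bias mentioned above.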
LibBi is a software package for state-space modelling and Bayesian inference on modern computer hardware, including multi-core central processing units (CPUs), many-core graphics processing units (GPUs) and distributed-memory clusters of such devices. The software parses a domain-specific language for model specification, then optimises, generates, compiles and runs code for the given model, inference method and hardware platform. In presenting the software, this work serves as an introduction to state-space models and the specialised methods developed for Bayesian inference with them. The focus is on sequential Monte Carlo (SMC) methods such as the particle filter for state estimation, and the particle Markov chain Monte Carlo (PMCMC) and SMC² methods for parameter estimation. All are well-suited to current computer hardware. Two examples are given and developed throughout: one a linear three-element windkessel model of the human arterial system, the other a nonlinear Lorenz '96 model. These are specified in the prescribed modelling language, and LibBi is demonstrated by performing inference with them. Empirical results are presented, including a performance comparison of the software under different hardware configurations.
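For readers new to these methods, the following is a minimal Python sketch of the bootstrap particle filter, the core state-estimation method named above. It is plain NumPy rather than the optimised code LibBi generates, and the callback names `transition` and `log_obs_density` are illustrative assumptions.

```python
import numpy as np

def bootstrap_particle_filter(y, x0, transition, log_obs_density, N, rng):
    """Run a bootstrap particle filter over observations y[0..T-1].

    transition(x, rng)      -> propagated particles, shape (N, dim)
    log_obs_density(y_t, x) -> log observation densities, shape (N,)
    Returns final particles and the log marginal likelihood estimate,
    the quantity PMCMC and SMC² reuse for parameter estimation.
    """
    x = np.repeat(x0[None, :], N, axis=0)
    log_lik = 0.0
    for y_t in y:
        x = transition(x, rng)                    # propagate: independent per particle
        logw = log_obs_density(y_t, x)            # weight: independent per particle
        m = logw.max()
        w = np.exp(logw - m)
        log_lik += m + np.log(w.mean())           # running marginal-likelihood estimate
        a = rng.choice(N, size=N, p=w / w.sum())  # multinomial resampling
        x = x[a]
    return x, log_lik

# Toy usage: a one-dimensional random walk observed with noise.
rng = np.random.default_rng(0)
x_true = np.cumsum(rng.normal(size=50))
y = x_true + rng.normal(scale=0.5, size=50)
_, ll = bootstrap_particle_filter(
    y, np.zeros(1),
    transition=lambda x, rng: x + rng.normal(size=x.shape),
    log_obs_density=lambda y_t, x: -0.5 * ((y_t - x[:, 0]) / 0.5) ** 2,
    N=1000, rng=rng)
print("log marginal likelihood estimate:", ll)
```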
The benefits of sequestering carbon are many, including improved crop productivity, reductions in greenhouse gases, and financial gains through the sale of carbon credits. Achieving a better understanding of the sequestration process has motivated many deterministic models of soil carbon dynamics, but none of these models addresses uncertainty in a comprehensive manner. Uncertainty arises in many ways: around the model inputs, parameters and dynamics, and subsequently around model predictions. In this paper, these uncertainties are addressed in concert by incorporating a physical-statistical model for carbon dynamics within a Bayesian hierarchical modelling framework. This comprehensive approach to accounting for uncertainty in soil carbon modelling has not been attempted previously. The paper demonstrates proof-of-concept based on a one-pool model and identifies requirements for extension to multi-pool carbon modelling. Our model is based on the soil carbon dynamics in Tarlee, South Australia. We specify the model conditionally through its parameters, soil carbon input and decay processes, and observations of those processes. We use a particle marginal Metropolis-Hastings approach specified using the LibBi modelling language. We highlight how samples from the posterior distribution can be used to summarise our knowledge about model parameters, to estimate the probabilities of sequestering carbon, and to forecast changes in carbon stocks under crop rotations not represented explicitly in the original field trials.
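A heavily simplified Python sketch of the particle marginal Metropolis-Hastings (PMMH) approach described above is given below, for a hypothetical one-pool model in which carbon stock decays at rate `k` and receives inputs, observed with error. The dynamics, priors, proposal scale and all numeric values are illustrative assumptions; the actual model is specified in LibBi, not hand-coded as here.

```python
import numpy as np

def log_marginal_likelihood(k, y, inputs, N, rng):
    """Particle-filter estimate of log p(y | k) for a toy one-pool model."""
    C = rng.normal(40.0, 5.0, size=N)               # initial stock (t/ha), assumed
    ll = 0.0
    for y_t, u_t in zip(y, inputs):
        C = C + u_t - k * C + rng.normal(0.0, 0.5, size=N)  # stochastic dynamics
        logw = -0.5 * ((y_t - C) / 2.0) ** 2                # Gaussian obs error, sd assumed
        m = logw.max()
        w = np.exp(logw - m)
        ll += m + np.log(w.mean())
        C = C[rng.choice(N, size=N, p=w / w.sum())]         # resample
    return ll

def pmmh(y, inputs, iters, N, rng):
    """Metropolis-Hastings on the decay rate k, using the noisy
    particle-filter likelihood in the acceptance ratio (flat prior assumed)."""
    k = 0.05                                        # initial decay rate, assumed
    ll = log_marginal_likelihood(k, y, inputs, N, rng)
    samples = []
    for _ in range(iters):
        k_prop = abs(k + rng.normal(0.0, 0.01))     # random walk, reflected at zero
        ll_prop = log_marginal_likelihood(k_prop, y, inputs, N, rng)
        if np.log(rng.uniform()) < ll_prop - ll:
            k, ll = k_prop, ll_prop
        samples.append(k)
    return np.array(samples)

# Toy usage with synthetic data (assumed true decay rate 0.08).
rng = np.random.default_rng(7)
inputs = rng.uniform(1.0, 3.0, size=30)
C_true, y = 40.0, []
for u in inputs:
    C_true = C_true + u - 0.08 * C_true + rng.normal(0.0, 0.5)
    y.append(C_true + rng.normal(0.0, 2.0))
samples = pmmh(np.array(y), inputs, iters=200, N=200, rng=rng)
print("posterior mean decay rate:", samples[100:].mean())
```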
In modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference, prediction, or decision problem. In such circumstances, it is convenient to use a graphical model to represent the statistical dependencies, via a set of connected 'modules', each relating to a specific data modality and drawing on specific domain expertise in its development. In principle, given data, the conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of any module can contaminate the estimate and update of others, often in unpredictable ways. In various settings, particularly when certain modules are trusted more than others, practitioners have preferred to avoid learning with the full model in favor of approaches that restrict the information propagation between modules, for example by restricting propagation to only particular directions along the edges of the graph. In this article, we investigate why these modular approaches might be preferable to the full model in misspecified settings, and we propose principled criteria to choose between modular and full-model approaches. The question arises in many applied settings, including large stochastic dynamical systems, meta-analysis, epidemiological models, air pollution models, pharmacokinetics-pharmacodynamics, and causal inference with propensity scores.
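As a toy illustration of one such restriction, the sketch below implements a two-stage 'cut' sampler on a hypothetical two-module graph: module 1 relates `theta1` to data `Y1`, and module 2 relates `theta2` (given `theta1`) to data `Y2`. Learning `theta1` from `Y1` alone stops misspecification in module 2 from contaminating it. The Gaussian modules, data and variances here are all illustrative assumptions, not an example from the article.

```python
import numpy as np

rng = np.random.default_rng(1)
Y1 = rng.normal(0.0, 1.0, size=50)   # data informing module 1
Y2 = rng.normal(3.0, 1.0, size=50)   # data informing module 2 (possibly misspecified)

def sample_theta1_cut(n):
    # Module 1 posterior theta1 | Y1 only: prior N(0,1), Y1_i ~ N(theta1, 1).
    post_var = 1.0 / (1.0 + len(Y1))
    return rng.normal(post_var * Y1.sum(), np.sqrt(post_var), size=n)

def sample_theta2_given(theta1):
    # Module 2 posterior theta2 | theta1, Y2: prior N(0,1), Y2_i ~ N(theta1 + theta2, 1).
    post_var = 1.0 / (1.0 + len(Y2))
    mean = post_var * (Y2 - theta1[:, None]).sum(axis=1)
    return rng.normal(mean, np.sqrt(post_var))

theta1 = sample_theta1_cut(10_000)      # stage 1: no feedback from Y2
theta2 = sample_theta2_given(theta1)    # stage 2: conditional on stage-1 draws
print(theta1.mean(), theta2.mean())
```

In the full-model update, by contrast, `Y2` would also inform `theta1` through the shared likelihood, which is exactly the feedback the cut removes.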
This article considers the problem of storing the paths generated by a particle filter and, more generally, by a sequential Monte Carlo algorithm. It provides a theoretical result bounding the expected memory cost by $T + C N \log N$, where $T$ is the time horizon, $N$ is the number of particles and $C$ is a constant, as well as an efficient algorithm to realise this bound. The theoretical result and the algorithm are illustrated with numerical experiments. (Comment: 9 pages, 5 figures. To appear in Statistics and Computing.)
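The flavour of such a storage scheme can be sketched in a few lines: keep the genealogy as a tree with per-node offspring counts and recycle any branch that no longer leads to a surviving particle. The Python sketch below is an illustrative reconstruction under those assumptions, not the article's exact algorithm.

```python
import numpy as np

class AncestryTree:
    """Particle genealogies as a pruned tree: each node holds a value, a
    parent index and a count of live children; extinct branches are recycled."""

    def __init__(self):
        self.value, self.parent, self.count = [], [], []
        self.free = []                              # recycled node slots

    def _alloc(self, value, parent):
        if self.free:
            i = self.free.pop()
            self.value[i], self.parent[i], self.count[i] = value, parent, 0
        else:
            i = len(self.value)
            self.value.append(value)
            self.parent.append(parent)
            self.count.append(0)
        if parent >= 0:
            self.count[parent] += 1
        return i

    def _prune(self, i):
        while i >= 0 and self.count[i] == 0:        # branch is extinct: free it
            self.free.append(i)
            p = self.parent[i]
            if p >= 0:
                self.count[p] -= 1
            i = p

    def insert(self, values, ancestors, leaves):
        """Add one generation; values[i] descends from leaves[ancestors[i]]."""
        new = [self._alloc(v, leaves[a] if leaves else -1)
               for v, a in zip(values, ancestors)]
        for leaf in leaves:
            if self.count[leaf] == 0:               # no offspring survived
                self._prune(leaf)
        return new

# Toy usage: live storage stays near the T + C N log N bound, not N * T.
rng = np.random.default_rng(0)
tree, leaves, N, T = AncestryTree(), [], 8, 200
for t in range(T):
    ancestors = rng.integers(0, N, size=N) if leaves else np.arange(N)
    leaves = tree.insert(rng.normal(size=N).tolist(), ancestors, leaves)
live = len(tree.value) - len(tree.free)
print(f"{live} live nodes (naive path storage would hold {N * T})")
```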