Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the accumulation of new data, and provide particle learning algorithms that allow for the efficient on-line posterior filtering of tree-states. A major advantage of tree regression is that it allows for the use of very simple models within each partition. We consider both constant and linear mean functions at the tree leaves, along with multinomial leaves for classification problems, and propose default prior specifications that allow for prediction to be integrated over all model parameters conditional on a given tree. Inference is illustrated in some standard nonparametric regression examples, as well as in the setting of sequential experiment design, including both active learning and optimization applications, and in on-line classification. We detail implementation guidelines and problem-specific methodology for each of these motivating applications. Throughout, we demonstrate that our practical approach provides better results than commonly used methods at a fraction of the cost.
This document describes the new features in version 2.x of the tgp package for R, implementing treed Gaussian process (GP) models. The topics covered include methods for dealing with categorical inputs and excluding inputs from the tree or GP part of the model; fully Bayesian sensitivity analysis for inputs/covariates; sequential optimization of black-box functions; and a new Monte Carlo method for inference in multi-modal posterior distributions that combines simulated tempering and importance sampling. These additions extend the functionality of tgp across all models in the hierarchy: from Bayesian linear models, to classification and regression trees (CART), to treed Gaussian processes with jumps to the limiting linear model. It is assumed that the reader is familiar with the baseline functionality of the package, outlined in the first vignette (Gramacy 2007).
This article develops a set of tools for smoothing and prediction with dependent point event patterns. The methodology is motivated by the problem of tracking weekly maps of violent crime events, but is designed to be straightforward to adapt to a wide variety of alternative settings. In particular, a Bayesian semiparametric framework is introduced for modeling correlated time series of marked spatial Poisson processes. The likelihood is factored into two independent components: the set of total integrated intensities and a series of process densities. For the former, it is assumed that Poisson intensities are realizations from a dynamic linear model. For the latter, a novel class of dependent stick-breaking mixture models is proposed to allow nonparametric density estimates to evolve in discrete time. This simple and flexible new model for dependent random distributions is based on autoregressive time series of marginally beta random variables applied as correlated stick-breaking proportions. The approach allows for marginal Dirichlet process priors at each time and adds only a single new correlation term to the static model specification. Sequential Monte Carlo algorithms are described for on-line inference with each model component, and marginal likelihood calculations form the basis for inference about parameters governing temporal dynamics. Simulated examples are provided to illustrate the methodology, and we close with results for the motivating application of tracking violent crime in Cincinnati.
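To make the stick-breaking construction mentioned above concrete, the following is a minimal Python sketch of truncated stick-breaking weights for a single, static Dirichlet process. It does not reproduce the article's autoregressive beta series that correlates the proportions over time. The function name `stick_breaking_weights`, the truncation level, and the concentration value are illustrative assumptions, not taken from the article.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each proportion v_k is drawn from Beta(1, alpha), and the k-th weight is
    v_k times the length of stick left after the first k - 1 breaks.
    """
    weights = []
    remaining = 1.0  # length of stick not yet assigned to a component
    for k in range(num_sticks):
        # Assumption: the final proportion is fixed at 1 so that the
        # truncated weights sum to one.
        v = 1.0 if k == num_sticks - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


rng = random.Random(0)  # fixed seed for reproducibility
weights = stick_breaking_weights(alpha=2.0, num_sticks=20, rng=rng)
```

In the dependent model described in the abstract, the proportions at each time point would be marginally beta distributed but correlated with their values at the previous time point; this sketch covers only the marginal construction at one time.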
We develop a Bayesian method for nonparametric model-based quantile regression. The approach involves flexible Dirichlet process mixture models for the joint distribution of the response and the covariates, with posterior inference for different quantile curves emerging from the conditional response distribution given the covariates. An extension to allow for partially observed responses leads to a novel Tobit quantile regression framework. We use simulated data sets and two data examples from the literature to illustrate the capacity of the model to uncover nonlinearities in quantile regression curves, as well as nonstandard features in the response distribution. KEY WORDS: Dirichlet process mixture model; Markov chain Monte Carlo; Multivariate normal mixture; Tobit quantile regression.
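As an illustration of how a quantile curve can emerge from a joint mixture model, the sketch below computes a conditional quantile of the response given a single covariate under one fixed bivariate normal mixture. In a posterior analysis this calculation would be repeated for each posterior draw of the mixture. The function names, the tuple layout of `components`, and the bisection solver are assumptions made for this example and are not the authors' implementation.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def conditional_quantile(x, p, components, tol=1e-10):
    """p-th quantile of y given x under a bivariate normal mixture.

    components: list of (weight, mean_x, mean_y, var_x, var_y, cov_xy).
    """
    cond = []
    for w, mx, my, vx, vy, cxy in components:
        # Unnormalized conditional weight: prior weight times marginal density of x.
        q = w * normal_pdf(x, mx, math.sqrt(vx))
        # Conditional mean and standard deviation of y given x in this component.
        mean = my + cxy / vx * (x - mx)
        sd = math.sqrt(vy - cxy * cxy / vx)
        cond.append((q, mean, sd))
    total = sum(q for q, _, _ in cond)

    def cdf(y):
        return sum(q * normal_cdf(y, m, s) for q, m, s in cond) / total

    # Bisection on the monotone conditional CDF, bracketed well into both tails.
    lo = min(m - 10.0 * s for _, m, s in cond)
    hi = max(m + 10.0 * s for _, m, s in cond)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Single-component check: y | x = 2 is normal with mean 1 + 0.5 * 2 = 2.
median_at_2 = conditional_quantile(2.0, 0.5, [(1.0, 0.0, 1.0, 1.0, 1.0, 0.5)])
```

Evaluating `conditional_quantile` over a grid of `x` values for a fixed `p` traces out one quantile curve; with several mixture components the curve can be nonlinear in `x`.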