We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are heteroskedastic, and observe that the posterior puts its mass on ever more high-dimensional models as the sample size increases. To remedy the problem, we equip the likelihood in Bayes' theorem with an exponent called the learning rate, and we propose the Safe Bayesian method to learn the learning rate from the data. SafeBayes tends to select small learning rates as soon the standard posterior is not 'cumulatively concentrated', and its results on our data are quite encouraging.
An important goal common to domain adaptation and causal inference is to make accurate predictions when the distributions for the source (or training) domain(s) and target (or test) domain(s) differ. In many cases, these different distributions can be modeled as different contexts of a single underlying system, in which each distribution corresponds to a different perturbation of the system, or in causal terms, an intervention. We focus on a class of such causal domain adaptation problems, where data for one or more source domains are given, and the task is to predict the distribution of a certain target variable from measurements of other variables in one or more target domains. We propose an approach for solving these problems that exploits causal inference and does not rely on prior knowledge of the causal graph, the type of interventions or the intervention targets. We demonstrate our approach by evaluating a possible implementation on simulated and real world data.
This paper discusses an alternative to conditioning that may be used when the probability distribution is not fully specified. It does not require any assumptions (such as CAR: coarsening at random) on the unknown distribution. The well-known Monty Hall problem is the simplest scenario where neither naive conditioning nor the CAR assumption suffice to determine an updated probability distribution. This paper thus addresses a generalization of that problem to arbitrary distributions on finite outcome spaces, arbitrary sets of 'messages', and (almost) arbitrary loss functions, and provides existence and characterization theorems for robust probability updating strategies. We find that for logarithmic loss, optimality is characterized by an elegant condition, which we call RCAR (reverse coarsening at random). Under certain conditions, the same condition also characterizes optimality for a much larger class of loss functions, and we obtain an objective and general answer to how one should update probabilities in the light of new information.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.