The probability density function of a probability distribution is a fundamental concept in probability theory and a key ingredient in various widely used machine learning methods. However, the framework needed for compiling probabilistic functional programs to density functions has only recently been developed. In this work, we present a density compiler for a probabilistic language with failure and both discrete and continuous distributions, and provide a proof of its soundness. The compiler greatly reduces the development effort of domain experts, which we demonstrate by solving inference problems from several scientific applications, such as modelling the global carbon cycle, using a standard Markov chain Monte Carlo framework.

In short, this paper addresses the problem of computing PDFs for a large class of programs written in a rich probabilistic programming language. An abridged version of this paper was published as (Bhat et al., 2013).

Probability density functions. In this work, probabilistic programs correspond directly to probability distributions. Distributions are important because they are a powerful formalism for data analysis. However, many techniques we would like to use require the probability density function of a distribution instead of the distribution itself. Unfortunately, not every distribution has a density function.

Distributions. One interpretation of a probabilistic program is that it is a simulation that can be run to generate a random sample from some set Ω of possible outcomes. The corresponding probability distribution P characterizes the program by assigning probabilities to different subsets of Ω (events).
The probability P(A) for a subset A of Ω corresponds to the proportion of runs that generate an outcome in A, in the limit of infinitely many repeated runs of the simulation.

Consider for example a simple mixture of Gaussians, here written in Fun (Borgström et al., 2011), a probabilistic functional language embedded within F# (Syme et al., 2007):

    if flip 0.7 then random(Gaussian(0.0, 1.0)) else random(Gaussian(4.0, 1.0))

The program above specifies a distribution on the real line (Ω is R) and corresponds to a generative process that flips a biased coin and then generates a number from one of two Gaussian distributions, both with standard deviation 1.0 but with mean either 0.0 or 4.0 depending on the result of the coin toss. In this example, we will be more likely to produce a value near 0.0 than near 4.0 because of the bias. The probability P(A) for A = [0, 1], for instance, is the proportion of runs that generate a number between 0 and 1.

Densities. A distribution P is a function that takes subsets of Ω as input, but for many purposes it turns out to be more convenient if we can find a function f that takes elements of Ω directly, while still somehow capturing the same information as P. When Ω is the real line, we are interested in a function f that satisfies P(A) = ∫_A f(x) dx for all intervals A, and we call f the probability density function (PDF) of the distribution P. In other words, f is a function where the area under its ...
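To make the two views above concrete, here is a minimal Python sketch (not the paper's Fun language) of the same Gaussian mixture. It estimates P([0, 1]) in both ways: first as the proportion of simulation runs landing in [0, 1], and then by numerically integrating a hand-written density f over [0, 1]. The function names and the midpoint-rule integration are illustrative choices, not part of the original development.

```python
import math
import random

def sample():
    # Generative process from the Fun program: flip a 0.7-biased coin,
    # then draw from Gaussian(0, 1) or Gaussian(4, 1).
    if random.random() < 0.7:
        return random.gauss(0.0, 1.0)
    return random.gauss(4.0, 1.0)

def pdf(x):
    # Density of the mixture: the weighted sum of the two Gaussian densities.
    def gauss_pdf(x, mu, sigma):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return 0.7 * gauss_pdf(x, 0.0, 1.0) + 0.3 * gauss_pdf(x, 4.0, 1.0)

# View 1: P([0,1]) as the long-run proportion of outcomes in [0,1].
random.seed(0)
n = 100_000
mc = sum(1 for _ in range(n) if 0.0 <= sample() <= 1.0) / n

# View 2: P([0,1]) as the area under the density, ∫_[0,1] f(x) dx,
# approximated here with the midpoint rule on 1000 subintervals.
steps = 1000
integral = sum(pdf((i + 0.5) / steps) for i in range(steps)) / steps

print(mc, integral)  # both estimates should roughly agree
```

Both numbers approximate the same quantity P([0, 1]) (about 0.24 for these parameters), illustrating that the density f carries the same information as the distribution P while being a function of points rather than of sets.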