Alexey Radul scite author profile

Alexey Radul

5Publications

99Citation Statements Received

59Citation Statements Given

How they've been cited

114

How they cite others

136

Affiliations

Google (United States), Massachusetts Institute of Technology, Moscow Institute of Thermal Technology

Publications

Order By: Most citations

Probabilistic programming with programmable inference

Mansinghka

Schaechtle

Handa

et al. 2018

View full text Add to dashboard Cite

We describe Venture, an interactive virtual machine for probabilistic programming that aims to be sufficiently expressive, extensible, and efficient for general-purpose use. Like Church, probabilistic models and inference problems in Venture are specified via a Turing-complete, higher-order probabilistic language descended from Lisp. Unlike Church, Venture also provides a compositional language for custom inference strategies, assembled from scalable implementations of several exact and approximate techniques. Venture is thus applicable to problems involving widely varying model families, dataset sizes and runtime/accuracy constraints. We also describe four key aspects of Venture's implementation that build on ideas from probabilistic graphical models. First, we describe the stochastic procedure interface (SPI) that specifies and encapsulates primitive random variables, analogously to conditional probability tables in a Bayesian network. The SPI supports custom control flow, higher-order probabilistic procedures, partially exchangeable sequences and "likelihood-free" stochastic simulators, all with custom proposals. It also supports the integration of external models that dynamically create, destroy and perform inference over latent variables hidden from Venture. Second, we describe probabilistic execution traces (PETs), which represent execution histories of Venture programs. Like Bayesian networks, PETs capture conditional dependencies, but PETs also represent existential dependencies and exchangeable coupling. Third, we describe partitions of execution histories called scaffolds that can be efficiently constructed from PETs and that factor global inference problems into coherent sub-problems. Finally, we describe a family of stochastic regeneration algorithms for efficiently modifying PET fragments contained within scaffolds without visiting conditionally independent random choices. Stochastic regeneration insulates inference algorithms from the complexities introduced by changes in execution structure, with runtime that scales linearly in cases where previous approaches often scaled quadratically and were therefore impractical. We show how to use stochastic regeneration and the SPI to implement general-purpose inference strategies such as Metropolis-Hastings, Gibbs sampling, and blocked proposals based on hybrids with both particle Markov chain Monte Carlo and mean-field variational inference techniques.

show abstract

AD in Fortran: Implementation via Prepreprocessor

Radul

Pearlmutter

Siskind

2012

View full text Add to dashboard Cite

We describe an implementation of the FARFEL FORTRAN AD extensions (Radul et al., 2012). These extensions integrate forward and reverse AD directly into the programming model, with attendant benefits to flexibility, modularity, and ease of use. The implementation we describe is a "prepreprocessor" that generates input to existing FORTRAN-based AD tools. In essence, blocks of code which are targeted for AD by FARFEL constructs are put into subprograms which capture their lexical variable context, and these are closure-converted into top-level subprograms and specialized to eliminate EXTERNAL arguments, rendering them amenable to existing AD preprocessors, which are then invoked, possibly repeatedly if the AD is nested.

show abstract

Getting to the point: index sets and parallelism-preserving autodiff for pointful array programming

Paszke¹,

Johnson

Duvenaud

et al. 2021

Proc. ACM Program. Lang.

View full text Add to dashboard Cite

We present a novel programming language design that attempts to combine the clarity and safety of high-level functional languages with the efficiency and parallelism of low-level numerical languages. We treat arrays as eagerly-memoized functions on typed index sets, allowing abstract function manipulations, such as currying, to work on arrays. In contrast to composing primitive bulk-array operations, we argue for an explicit nested indexing style that mirrors application of functions to arguments. We also introduce a fine-grained typed effects system which affords concise and automatically-parallelized in-place updates. Specifically, an associative accumulation effect allows reverse-mode automatic differentiation of in-place updates in a way that preserves parallelism. Empirically, we benchmark against the Futhark array programming language, and demonstrate that aggressive inlining and type-driven compilation allows array programs to be written in an expressive, "pointful" style with little performance penalty.

show abstract

Perturbation confusion in forward automatic differentiation of higher-order functions

Manzyuk¹,

Pearlmutter²,

Radul

et al. 2019

J. Funct. Prog.

View full text Add to dashboard Cite

Automatic Differentiation (AD) is a technique for augmenting computer programs to compute derivatives. The essence of AD in its forward accumulation mode is to attach perturbations to each number, and propagate these through the computation by overloading the arithmetic operators. When derivatives are nested, the distinct derivative calculations, and their associated perturbations, must be distinguished. This is typically accomplished by creating a unique tag for each derivative calculation and tagging the perturbations. We exhibit a subtle bug, present in fielded implementations which support derivatives of higher-order functions, in which perturbations are confused despite the tagging machinery, leading to incorrect results. The essence of the bug is this: a unique tag is needed for each derivative calculation, but in existing implementations unique tags are created when taking the derivative of a function at a point. When taking derivatives of higher-order functions, these need not correspond! We exhibit a simple example: a higher-order function f whose derivative at a point x, namely f ′ (x), is itself a function which calculates a derivative. This situation arises naturally when taking derivatives of curried functions. Two potential solutions are presented, and their deficiencies discussed. One uses eta expansion to delay the creation of fresh tags in order to put them into oneto-one correspondence with derivative calculations. The other wraps outputs of derivative operators with tag substitution machinery. Both solutions seem very difficult to implement without violating the desirable complexity guarantees of Forward AD.

show abstract

Probabilistic programming with programmable inference

et al. 2018

View full text Add to dashboard Cite

We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the rst demonstration of e ectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded into the language implementation as in previous probabilistic programming languages, infer- ence metaprogramming enables developers to 1) dynamically decompose inference problems into subproblems, 2) apply in- ference tactics to subproblems, 3) alternate between incorpo- rating new data and performing inference over existing data, and 4) explore multiple execution traces of the probabilis- tic program at once. Implemented tactics include gradient- based optimization, Markov chain Monte Carlo, variational inference, and sequental Monte Carlo techniques. Inference metaprogramming enables the concise expression of proba- bilistic models and inference algorithms across diverse elds, such as computer vision, data science, and robotics, within a single probabilistic programming language.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Alexey Radul

Probabilistic programming with programmable inference

AD in Fortran: Implementation via Prepreprocessor

Getting to the point: index sets and parallelism-preserving autodiff for pointful array programming

Perturbation confusion in forward automatic differentiation of higher-order functions

Probabilistic programming with programmable inference

Contact Info

Product

Resources

About