Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning

Sun, Xudong; Bischl, Bernd

doi:10.1109/ssci44817.2019.9003114

Cited by 9 publications

(2 citation statements)

References 24 publications


“…Statistical modelling has been extremely successful in the context of modern machine learning and moment matching is an efficient technique to do so. We refer the reader to the surveys and references therein [21,38,51,62].…”

Section: Random Feature Injection (mentioning)

confidence: 99%

Arda

Chepurko, Marcus, Zgraggen, et al. 2020

Proc. VLDB Endow.


Automatic machine learning (AML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects of the machine learning pipeline like model selection, hyperparameter tuning, and feature selection, relatively few works have focused on automatic data augmentation. Automatic data augmentation involves finding new features relevant to the user's predictive task with minimal "human-in-the-loop" involvement. We present ARDA, an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented data set such that training a predictive model on this augmented dataset results in improved performance. Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join. We perform an extensive empirical evaluation of different system components and benchmark our feature selection algorithm on real-world datasets.


“…DEC-POMDP can be solved using control as inference [16, 17]. Control as inference is a framework to interpret a control problem as an inference problem by introducing auxiliary variables [18, 19, 20, 21, 22]. Although control as inference has several variants, Toussaint and Storkey showed that the planning of MDP can be interpreted as the maximum likelihood estimation for a latent variable model [18].…”

Section: Introduction (mentioning)

confidence: 99%

Forward and Backward Bellman Equations Improve the Efficiency of the EM Algorithm for DEC-POMDP

Tottori, Kobayashi 2021

Entropy


Decentralized partially observable Markov decision process (DEC-POMDP) models sequential decision making problems by a team of agents. Since the planning of DEC-POMDP can be interpreted as the maximum likelihood estimation for the latent variable model, DEC-POMDP can be solved by the EM algorithm. However, in EM for DEC-POMDP, the forward–backward algorithm needs to be calculated up to the infinite horizon, which impairs the computational efficiency. In this paper, we propose the Bellman EM algorithm (BEM) and the modified Bellman EM algorithm (MBEM) by introducing the forward and backward Bellman equations into EM. BEM can be more efficient than EM because BEM calculates the forward and backward Bellman equations instead of the forward–backward algorithm up to the infinite horizon. However, BEM cannot always be more efficient than EM when the size of problems is large because BEM calculates an inverse matrix. We circumvent this shortcoming in MBEM by calculating the forward and backward Bellman equations without the inverse matrix. Our numerical experiments demonstrate that the convergence of MBEM is faster than that of EM.
