Program Search for Machine Learning Pipelines Leveraging Symbolic Planning and Reinforcement Learning

Yang, Fangkai; Gustafson, Steven; Elkholy, Alexander; Lyu, Daoming; Liu, Bo

doi:10.1007/978-3-030-04735-1_11

Cited by 3 publications

(3 citation statements)

References 13 publications

“…Other Reinforcement Learning based methods: In [32], the authors also combine pipeline search and hyper-parameter optimization in a reinforcement learning process based on the PEORL [33] framework, however, the hyperparameter is randomly sampled during the reinforcement learning process, an extra stage is needed to sweep the hyper-parameters using hyper-parameter optimization techniques, while in our work, hyper-parameter optimization is embedded in the reinforcement learning process. Alpha3M [14] combined MCTS and recurrent neural network in a self play [27] fashion, however, it seems that Alpha3M does not perform better than the state of art AutoML systems.…”

Section: Reinforcement Learning Based Neural Network Architecture Search (mentioning)

confidence: 99%

“…-To our best knowledge, we are the first to embed Bayesian Optimization (BO) into Reinforcement learning, specifically Q Learning [31] for collaborative joint search of pipelines and hyper-parameters, which is different from using BO for policy optimization [12], and also different from using BO for hyper-parameter fine tuning after an optimal pipeline is selected by a reinforcement learning based AutoML framework [32]. -We provide an open source light weight R language implementation reinbo 1 for the R Machine Learning community which could run efficiently on a personal computer, and takes much less resources compared to other AutoML softwares.…”

Section: arXiv:1904.05381v1 [cs.LG] 10 Apr 2019 (mentioning)

confidence: 99%

ReinBo: Machine Learning Pipeline Conditional Hierarchy Search and Configuration with Bayesian Optimization Embedded Reinforcement Learning

Sun, Lin, Bischl. 2020. Communications in Computer and Information Science.

A machine learning pipeline potentially consists of several stages of operations, such as data preprocessing, feature engineering, and machine learning model training. Each operation has a set of hyper-parameters, which can become irrelevant to the pipeline when the operation is not selected. This gives rise to a hierarchical conditional hyper-parameter space. To optimize this mixed continuous and discrete conditional hierarchical hyper-parameter space, we propose an efficient pipeline search and configuration algorithm that combines Reinforcement Learning and Bayesian Optimization. Empirical results show that our method performs favorably compared to state-of-the-art methods such as Auto-sklearn, TPOT, Tree Parzen Window, and Random Search.
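
As a rough illustration of the hierarchical conditional space described in this abstract, the Python sketch below draws random pipelines from a made-up three-stage space. The stage names, operations, and value ranges are hypothetical and not taken from the paper, and the sampler is plain random search (one of the baselines the abstract names), not the Reinforcement Learning and Bayesian Optimization method itself.

```python
import random

# Hypothetical search space (not from the paper): each stage offers several
# operations, and each operation owns its own hyper-parameters.
# A list is a discrete domain; a (low, high) tuple is a continuous range.
SPACE = {
    "preprocess": {
        "none": {},
        "scale": {"center": [True, False]},
    },
    "feature": {
        "none": {},
        "pca": {"variance_kept": (0.5, 0.99)},
    },
    "model": {
        "knn": {"k": [1, 3, 5, 7]},
        "svm": {"cost": (0.01, 100.0)},
    },
}


def sample_value(domain, rng):
    """Draw one value from a discrete list or a continuous (low, high) range."""
    if isinstance(domain, tuple):
        low, high = domain
        return rng.uniform(low, high)
    return rng.choice(domain)


def sample_pipeline(space, rng):
    """Pick one operation per stage, then sample only that operation's
    hyper-parameters. Hyper-parameters of unselected operations are never
    sampled, which is what makes the space conditional."""
    pipeline = {}
    for stage, operations in space.items():
        op = rng.choice(sorted(operations))
        params = {
            name: sample_value(domain, rng)
            for name, domain in operations[op].items()
        }
        pipeline[stage] = (op, params)
    return pipeline


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_pipeline(SPACE, rng))
```

Each sampled pipeline carries hyper-parameters only for the operations it actually selected, so the number and type of active dimensions change from sample to sample.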

“…While many methods extract a symbolic mapping for RL from visual data, e.g. (Lyu et al, 2019; Yang et al, 2018, 2019; Lu et al, 2018; Garnelo et al, 2016; Li et al, 2018; Liang & Boularias, 2018; Goel et al, 2018), they all require that all of the reward-relevant features are explicitly represented in the symbolic space. As shown by the many successes of Deep RL, e.g.…”

Section: Related Work (mentioning)

confidence: 99%

Verifiably Safe Exploration for End-to-End Reinforcement Learning

Hunt, Fulton, Magliacane, et al. 2020. Preprint.

Deploying deep reinforcement learning in safety-critical settings requires developing algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints. Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints. On each of these benchmark problems, our algorithm completely avoids unsafe behavior while remaining competitive at optimizing for as much reward as is safe. We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment.
