Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/675

PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making

Abstract: Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with the real world, which often requires an infeasibly large amount of experience. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainties and changes. In this paper we present a unified framework, PEORL, that integrates symbolic planning with hierarchical reinforcement learning (HRL) to cope with…

Cited by 81 publications (64 citation statements). References: 0 publications.
“…Other Reinforcement Learning based methods: In [32], the authors also combine pipeline search and hyper-parameter optimization in a reinforcement learning process based on the PEORL [33] framework. However, the hyper-parameters are randomly sampled during the reinforcement learning process, and an extra stage is needed to sweep them using hyper-parameter optimization techniques, whereas in our work hyper-parameter optimization is embedded in the reinforcement learning process. Alpha3M [14] combined MCTS and a recurrent neural network in a self-play [27] fashion; however, Alpha3M does not appear to perform better than state-of-the-art AutoML systems.…”
Section: Reinforcement Learning Based Neural Network Architecture Search
confidence: 99%
“…Integrating robot task planning and learning of navigation costs has also been investigated [15]. Recent approaches such as PEORL [35] and SDRL [25] use closed-loop communication between planning and learning: an optimal symbolic plan is obtained through an iterative process of planning and learning, so that the two components mutually benefit each other. However, most of these approaches have only been applied to artificial domains.…”
Section: Related Work
confidence: 99%
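The closed loop described in this excerpt, where a symbolic planner proposes a plan, RL executes and evaluates its actions, and the learned values feed back into the next planning call, can be illustrated with a minimal sketch. This is not the PEORL or SDRL implementation: the planner and option-execution callables and the quality bookkeeping below are hypothetical placeholders.

```python
# Minimal sketch of a planning-learning loop in the spirit of PEORL/SDRL.
# `symbolic_plan` and `execute_option` are hypothetical callables standing in
# for the ASP planner and the option-level RL executor; this is not the
# authors' API.

def planning_learning_loop(symbolic_plan, execute_option, init_state, goal,
                           episodes=100):
    quality = {}                      # learned quality of each symbolic action
    best_plan, best_value = None, float("-inf")
    for _ in range(episodes):
        # Planning: request a plan whose estimated quality beats the best
        # value found so far, using learned qualities as action "costs".
        plan = symbolic_plan(init_state, goal, quality, lower_bound=best_value)
        if plan is None:              # no better plan exists: loop has converged
            break
        # Learning: execute each symbolic action as an RL option and record
        # the reward it actually achieved, feeding it back to the planner.
        plan_value = 0.0
        for state, action in plan:
            reward = execute_option(state, action)
            quality[(state, action)] = reward
            plan_value += reward
        if plan_value > best_value:   # keep the best plan seen so far
            best_plan, best_value = plan, plan_value
    return best_plan, quality
```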
“…These approaches were based on integrating symbolic planning with value-iteration methods of reinforcement learning, and in their work there was no bidirectional communication loop between planning and learning, so the two could not mutually benefit each other. The latest work in this direction is the PEORL framework [42] and SDRL [21], where ASP-based planning is integrated with R-learning [35] in a planning-learning loop. The PACMAN architecture is a new framework for integrating symbolic planning with RL, in particular integrating planning with the actor-critic (AC) algorithm for the first time, and it also features bidirectional communication between planning and learning.…”
Section: Related Work
confidence: 99%
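R-learning [35], referenced in this excerpt, is an average-reward (undiscounted) temporal-difference method: it learns action values relative to an estimate of the average reward per step. Below is a minimal tabular sketch of one update, following the textbook formulation of R-learning; the variable names and default step sizes are illustrative and not taken from the cited papers.

```python
def r_learning_update(R, rho, actions, s, a, r, s_next, alpha=0.1, beta=0.01):
    """One tabular R-learning step (average-reward RL).

    R       : dict mapping (state, action) -> relative action value
    rho     : current estimate of the average reward per time step
    actions : finite set of actions assumed available in every state
    Returns the updated rho (R is updated in place).
    """
    best_here = max(R.get((s, a2), 0.0) for a2 in actions)
    best_next = max(R.get((s_next, a2), 0.0) for a2 in actions)
    was_greedy = R.get((s, a), 0.0) >= best_here     # was `a` a greedy choice?
    # Relative-value update: reward is measured against the average rate rho.
    R[(s, a)] = R.get((s, a), 0.0) + alpha * (r - rho + best_next - R.get((s, a), 0.0))
    # Only greedy steps adjust rho, so exploration does not bias the estimate.
    if was_greedy:
        rho += beta * (r - rho + best_next - best_here)
    return rho
```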
“…From the first perspective, research from the KR community on modular action languages [20,5,10] proposed formal languages that encode a general-purpose library of actions from which a wide range of benchmark planning problems can be defined as special cases, leading to a representation that is elaboration tolerant and addressing the problem of the generality of AI [24]. Meanwhile, researchers from the RL community focused on incorporating high-level abstraction into flat RL, leading to the options framework for hierarchical RL [2], hierarchical abstract machines [27], and, more recently, work that integrates symbolic knowledge represented in answer set programming (ASP) into the reinforcement learning framework [19,42,21,11]. From the second perspective, imitation learning, including learning from demonstration (LfD) [1] and inverse reinforcement learning (IRL) [26], tries to learn policies from examples provided by a human expert, or to learn directly from human feedback [39,15,6], a.k.a. human-centered reinforcement learning (HCRL).…”
Section: Introduction
confidence: 99%
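For reference, the options framework for hierarchical RL [2] mentioned in this excerpt models a temporally extended action as a triple of an initiation set, an intra-option policy, and a termination condition; in planning-plus-RL frameworks such as PEORL, each symbolic action is typically executed as such an option. A minimal illustration of that triple as a data structure follows (the field names are illustrative, not taken from the cited works).

```python
from dataclasses import dataclass
from typing import Any, Callable, Set

State = Any      # placeholder state type
Action = Any     # placeholder primitive-action type

@dataclass
class Option:
    """A temporally extended action in the options framework."""
    initiation_set: Set[State]              # states where the option may be invoked
    policy: Callable[[State], Action]       # intra-option (low-level) policy
    termination: Callable[[State], float]   # probability of terminating in a state

    def applicable(self, s: State) -> bool:
        return s in self.initiation_set
```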