2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros40897.2019.8967761
Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning

Abstract: We address one-shot imitation learning, where the goal is to execute a previously unseen task based on a single demonstration. While there has been exciting progress in this direction, most of the approaches still require a few hundred tasks for meta-training, which limits the scalability of the approaches. Our main contribution is to formulate one-shot imitation learning as a symbolic planning problem along with the symbol grounding problem. This formulation disentangles the policy execution from the inter-ta…

Cited by 24 publications (18 citation statements) · References 20 publications
“…For visual inputs, [11] and [21] experimented with 3 separate settings: simulated planar reaching (with different target object colors), simulated planar pushing (with varying target object locations), and real-robot, object-in-hand placing (onto different target containers); [45] set up a two-stage pick-then-place task with varying target objects and target containers; [7] used a simulated Pick & Place task with 4 objects to pick and 4 target bins to place (hence 16 variations in total). The AI2-THOR [22] environment used in [19] requires collecting varying objects and dropping them off at their designated receptacles, where actions are purely semantic concepts such as "dropoff" or "search". In contrast, in this work we consider a harder, multi-task setup, where the agent needs to perform well across more diverse and distinct tasks, and generalize not only to new instances of all the seen variations, but also to completely novel tasks.…”
Section: F. Further Discussion On Related Work
confidence: 99%
“…Later work extended OSIL to observe visual inputs: [11] applies the Model-Agnostic Meta-Learning algorithm (MAML) [10] to adapt policy model parameters for new tasks; TecNets [21] applies a hinge rank loss to learn explicit task embeddings; DAML [45] adds a domain-adaptation objective to MAML to use human demonstration videos; [7] improves the policy network with the Transformer architecture [41]. Another line of work learns modular task structures that can be reused at test time [43], [18], [19], but the outputs of these symbolic policies are highly abstracted into semantic action concepts (e.g. "pick", "release") that assume extensive domain knowledge and human-designed priors.…”
Section: Related Work: Imitation Learning
confidence: 99%
“…The effort required to define new actions makes scalability an issue. Future work includes using learning-based approaches to alleviate the engineering bottlenecks, such as learning the preconditions and postconditions of the symbolic actions, as is considered by Huang et al [32], or learning the constraint functions from demonstration.…”
Section: Discussion
confidence: 99%
“…Given these challenges, it's natural to examine the use of learning to improve task and motion planning with real sensing [4,5,6,7,8,9]. However, previous methods fail to solve the full problem of unknown object rearrangement with physical robots.…”
Section: Introduction
confidence: 99%
“…However, previous methods fail to solve the full problem of unknown object rearrangement with physical robots. Some only operate on known objects [6,9], others ignore or significantly restrict the space of robot control [4,5] or relations [7,8], while still others assume an explicit goal configuration is given [5]. An alternative approach to solving complex manipulation tasks relies on learning model-free neural net policies instead of explicit models of conditions and effects [10,11].…”
Section: Introduction
confidence: 99%