2022 · Preprint
DOI: 10.48550/arxiv.2204.12471
Coarse-to-fine Q-attention with Tree Expansion

Abstract: Coarse-to-fine Q-attention enables sample-efficient robot manipulation by discretizing the translation space in a coarse-to-fine manner, where the resolution gradually increases at each layer in the hierarchy. Although effective, Q-attention suffers from "coarse ambiguity": when voxelization is significantly coarse, it is not feasible to distinguish similar-looking objects without first inspecting at a finer resolution. To combat this, we propose to envision Q-attention as a tree that can be expanded and used …
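The idea in the abstract can be illustrated with a small, hedged sketch: each level voxelizes the scene around a candidate translation, and tree expansion keeps the top-k voxels per level and compares their leaf Q-values at the finest resolution before committing, rather than greedily picking a single coarse voxel. This is a toy sketch, not the authors' implementation; voxelize, q_values, GRID, LEVELS, and TOP_K are hypothetical names, and the learned Q-network is replaced by a random-score placeholder.

# Toy coarse-to-fine Q-attention with tree expansion (hypothetical sketch,
# not the authors' code). The Q-network is replaced by random scores.
import numpy as np

GRID = 16     # voxels per side at every level of the hierarchy
LEVELS = 3    # number of coarse-to-fine levels
TOP_K = 4     # candidate voxels kept per node (the tree expansion)

def voxelize(points, centre, extent, grid=GRID):
    # Histogram an (N, 3) point cloud into a grid**3 occupancy volume
    # centred on `centre` with half-width `extent`.
    lo, hi = centre - extent, centre + extent
    idx = ((points - lo) / (hi - lo) * grid).astype(int)
    keep = np.all((idx >= 0) & (idx < grid), axis=1)
    vol = np.zeros((grid, grid, grid), dtype=np.float32)
    np.add.at(vol, tuple(idx[keep].T), 1.0)
    return vol

def q_values(volume):
    # Stand-in for the learned Q-network: one value per voxel.
    return np.random.rand(*volume.shape)

def expand(points, centre, extent, level=0):
    # Recursively expand the tree and return the best (value, translation)
    # leaf, so the choice among similar-looking coarse voxels is deferred
    # until they have been inspected at the finest resolution.
    q = q_values(voxelize(points, centre, extent))
    top = np.argsort(q, axis=None)[::-1][:TOP_K]
    best = (-np.inf, centre)
    for flat in top:
        ijk = np.array(np.unravel_index(flat, q.shape))
        child_centre = centre - extent + (ijk + 0.5) * (2 * extent / GRID)
        child_extent = extent / GRID            # zoom in on the chosen voxel
        if level + 1 == LEVELS:
            cand = (q[tuple(ijk)], child_centre)
        else:
            cand = expand(points, child_centre, child_extent, level + 1)
        best = max(best, cand, key=lambda c: c[0])
    return best

cloud = np.random.rand(2048, 3)                 # toy point cloud in [0, 1]^3
value, translation = expand(cloud, centre=np.full(3, 0.5), extent=np.full(3, 0.5))

In this toy setup, expanding TOP_K > 1 branches per level is what lets the agent defer the decision between objects that look identical at the coarsest voxelization, which is the "coarse ambiguity" the paper targets.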

Cited by 4 publications (4 citation statements) · References 20 publications

“…Following James and Davison [39], we fill a replay buffer with 50 or 100 expert demonstrations. Unlike prior approaches that utilize a path planner with the policy to output the next-best gripper pose [39,3,21,40,41], our RL agent outputs a relative change in gripper position. We provide further details in Appendix A. Aggregate success rate on five multi-view and single-view control tasks and two viewpoint-robust control tasks.…”
Section: Methods (mentioning)
confidence: 99%
“…Akbulut et al. [2] introduce a new framework called Adaptive Conditional Neural Movement Primitives, combining supervised learning and RL to conserve old skills learned from robot demonstrations while being adaptive to new environments. James and Davison [51] present a coarse-to-fine discrete RL algorithm to solve sparse-reward manipulation tasks using only a small amount of demonstration and exploration data (work extended by [49] and [50]). Celemin et al. [19] include human corrective advice in the action domain through a learning-from-demonstration approach, while an RL algorithm guides the learning process by filtering out human feedback that does not maximize the reward.…”
Section: Learning From Demonstration (mentioning)
confidence: 99%
“…C2F-ARM takes this next-best pose and uses a motion planner to take the robot to the goal pose. In this work, we use the original C2F-ARM algorithm, and do not include any of its subsequent extensions, e.g., learned path ranking [66] and tree expansion [67].…”
Section: Additional RLBench Remarks (mentioning)
confidence: 99%
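The last statement above describes the keyframe-style control loop around the original C2F-ARM: the agent predicts a next-best gripper pose and a motion planner drives the arm there. Below is a hedged sketch of that loop; the names (Pose, QAttentionAgent, MotionPlanner, run_episode, env.observe) are hypothetical stand-ins and do not correspond to the released C2F-ARM or RLBench API.

# Hypothetical sketch of a C2F-ARM-style control loop; placeholder names only.
from dataclasses import dataclass
import numpy as np

@dataclass
class Pose:
    position: np.ndarray      # (3,) translation chosen by Q-attention
    quaternion: np.ndarray    # (4,) gripper orientation
    gripper_open: bool

class QAttentionAgent:
    def next_best_pose(self, observation) -> Pose:
        # In the real system this would run coarse-to-fine Q-attention;
        # here it returns a fixed placeholder pose.
        return Pose(np.zeros(3), np.array([0.0, 0.0, 0.0, 1.0]), True)

class MotionPlanner:
    def go_to(self, pose: Pose):
        # Stand-in for a collision-free planner that moves the arm to the
        # requested gripper pose.
        pass

def run_episode(env, agent, planner, max_steps=10):
    # Keyframe control: one predicted pose per step, low-level motion is
    # delegated to the planner rather than learned per timestep.
    obs, done = env.reset(), False
    for _ in range(max_steps):
        pose = agent.next_best_pose(obs)
        planner.go_to(pose)
        obs, reward, done = env.observe()
        if done:
            break
    return done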