2021
DOI: 10.48550/arxiv.2110.12618
Preprint
Learning Insertion Primitives with Discrete-Continuous Hybrid Action Space for Robotic Assembly Tasks

Abstract: This paper introduces a discrete-continuous action space to learn insertion primitives for robotic assembly tasks. A primitive is a sequence of elementary actions with certain exit conditions, such as "pushing down the peg until contact". Since a primitive is an abstraction of robot control commands and encodes human prior knowledge, it reduces exploration difficulty and yields better learning efficiency. In this paper, we learn robot assembly skills via primitives. Specifically, we formulate insertion pri…
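The hybrid formulation the abstract describes can be made concrete with a short sketch. The Python fragment below is an illustrative reading, not the paper's implementation: the primitive names, parameter counts, and sampling ranges are hypothetical, chosen only to show how a discrete primitive choice pairs with continuous parameters in one action.

```python
from dataclasses import dataclass
import random

# Hypothetical primitive library: each entry names an elementary-action
# sequence and the dimensionality of its continuous parameters.
# Names and parameter counts are illustrative, not taken from the paper.
PRIMITIVES = {
    0: ("push_down_until_contact", 1),   # param: force threshold
    1: ("spiral_search", 2),             # params: radius, pitch
    2: ("tilt_and_insert", 2),           # params: tilt angle, depth
}

@dataclass
class HybridAction:
    primitive_id: int      # discrete choice: which primitive to run
    params: list[float]    # continuous parameters for that primitive

def sample_hybrid_action() -> HybridAction:
    """Uniformly sample from the discrete-continuous hybrid space."""
    pid = random.choice(list(PRIMITIVES))
    _, n_params = PRIMITIVES[pid]
    return HybridAction(pid, [random.uniform(-1.0, 1.0) for _ in range(n_params)])

def execute(action: HybridAction) -> None:
    """Placeholder executor: a real controller would run the primitive's
    elementary actions until its exit condition (e.g., contact) fires."""
    name, _ = PRIMITIVES[action.primitive_id]
    print(f"running {name} with params {action.params}")

if __name__ == "__main__":
    execute(sample_hybrid_action())
```

A learned policy would replace the uniform sampling: it outputs a distribution over primitive IDs plus the continuous parameters for the chosen primitive, which is what makes the action space discrete-continuous rather than purely one or the other.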

Cited by 1 publication (2 citation statements)
References 14 publications
“…Efforts using MuJoCo or the robosuite extension [112] include [19,22,29,37,38,48,82,88,96,97,99,108,110,111]. Simulated rigid-body assembly tasks are limited to peg-in-hole insertion of large pegs with round, triangular, square, and prismatic cross-sections, lap-joint mating, and one non-convex insertion [29].…”
Section: B. Robotic Assembly Simulation
Mentioning confidence: 99%
“…Most recent works have used model-free, off-policy RL algorithms or variants, which do not predict environment response, and update an action-value function independently of the current policy (e.g., using a replay buffer). These studies have applied Q-learning [35], deep Q-networks [110], deep deterministic policy gradients (DDPG) [5,58,56,97], soft actor-critic [7], probabilistic embeddings [82], and hierarchical RL [32]. These algorithms are typically chosen for sample efficiency, but are often brittle and slow or unable to converge.…”
Section: B. Robotic Assembly Simulation
Mentioning confidence: 99%
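For readers unfamiliar with the off-policy setup this statement refers to, here is a minimal tabular sketch (not drawn from any of the cited works) of an action-value update computed from transitions sampled out of a replay buffer, so the update does not depend on the policy that generated the data. State and action counts, learning rate, and discount are arbitrary placeholders.

```python
import random
from collections import deque

# Tabular Q-learning with a replay buffer: the off-policy update uses
# stored (s, a, r, s') transitions regardless of which policy produced them.
N_STATES, N_ACTIONS = 5, 3
ALPHA, GAMMA = 0.1, 0.99

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
replay = deque(maxlen=10_000)  # (s, a, r, s_next) transitions

def store(s, a, r, s_next):
    replay.append((s, a, r, s_next))

def update(batch_size=32):
    """One off-policy step: sample past transitions and move Q(s, a)
    toward the bootstrapped target r + gamma * max_a' Q(s', a')."""
    if len(replay) < batch_size:
        return
    for s, a, r, s_next in random.sample(list(replay), batch_size):
        target = r + GAMMA * max(Q[s_next])
        Q[s][a] += ALPHA * (target - Q[s][a])

if __name__ == "__main__":
    # Fill the buffer with random transitions, then run one update.
    for _ in range(100):
        store(random.randrange(N_STATES), random.randrange(N_ACTIONS),
              random.random(), random.randrange(N_STATES))
    update()
    print(Q[0])
```

DQN, DDPG, and soft actor-critic replace the table with function approximators and modify the target computation, but they share this buffer-based, policy-independent update structure.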