2020 American Control Conference (ACC)
DOI: 10.23919/acc45564.2020.9148047

Online, Model-Free Motion Planning in Dynamic Environments: An Intermittent, Finite Horizon Approach with Continuous-Time Q-Learning


Cited by 7 publications (3 citation statements)
References 18 publications
“…Although PRM-RL [88] and RL-RRT [89] both combine a high-level sampling-based planner with an RL-based local obstacle-avoiding policy, the latter's RL agent can be deployed directly, without the need to tune the reward or network structure. To achieve online, model-free kinodynamic planning, the frameworks of [90][91][92] were proposed. Benefiting from the actor-critic (AC) neural network's dual role of vertex extension and metric, these frameworks can learn the optimal policy without model information and thereby tackle every two-point boundary value problem arising from the RRTs.…”
Section: Motion Planning With Policy-Based RL Methods
Citation type: mentioning
confidence: 99%
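The excerpt above describes wiring a learned policy into the RRT loop as its steering primitive. Below is a minimal, hypothetical sketch of that pattern, not the cited papers' actual implementation: `dynamics`, `policy`, the state bounds, and all dimensions are illustrative assumptions, and the Euclidean nearest-neighbor metric stands in for the critic-derived metric the excerpt mentions.

```python
# Illustrative sketch: a learned policy as the steering function of a
# kinodynamic RRT, replacing an analytic two-point boundary value solver.
# All names and dynamics here are assumptions for demonstration only.
import numpy as np

def dynamics(x, u, dt=0.05):
    # Placeholder single-integrator dynamics; the real system would be
    # nonlinear and, in the model-free setting, unknown to the planner.
    return x + dt * u

def policy(x, x_goal):
    # Stand-in for a trained actor network: steer toward the goal state.
    return np.clip(x_goal - x, -1.0, 1.0)

def rl_steer(x_from, x_to, steps=20):
    """Roll the learned policy out from x_from toward x_to."""
    x = x_from
    for _ in range(steps):
        x = dynamics(x, policy(x, x_to))
    return x

def rrt(x_init, x_goal, n_iter=500, tol=0.1, seed=0):
    rng = np.random.default_rng(seed)
    nodes = [np.asarray(x_init, dtype=float)]
    for _ in range(n_iter):
        x_rand = rng.uniform(-5.0, 5.0, size=len(x_init))
        # Nearest vertex under the Euclidean metric; the cited framework
        # would instead use the critic's learned value as the metric.
        x_near = min(nodes, key=lambda n: np.linalg.norm(n - x_rand))
        nodes.append(rl_steer(x_near, x_rand))
        if np.linalg.norm(nodes[-1] - np.asarray(x_goal)) < tol:
            return nodes, True
    return nodes, False

nodes, found = rrt([0.0, 0.0], [3.0, 3.0])
print(f"goal reached: {found}, tree size: {len(nodes)}")
```

The design point the excerpt highlights is that `rl_steer` extends the tree without any model information, since the policy itself solves each local two-point boundary value problem.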
“…The authors of [24] developed a controller with intermittent communication for a multi-agent system, in which the dwell-time conditions needed for stability, as well as the safety constraints, were expressed as metric temporal logic specifications. Finally, event-triggering mechanisms were applied to the problem of autonomous path planning [25]. Compared with the aforementioned works, and to the best of our knowledge, this is the first time that an optimal, safe, and intermittent learning framework combined with formal methods has been used in a continuous-time setting for tracking a family of trajectories.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
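As background to the event-triggering mechanisms and dwell-time conditions this excerpt mentions, here is a generic sketch of a relative-threshold trigger with a minimum dwell time. It is not the construction of [24] or [25]; `alpha`, `dwell`, and the feedback law are illustrative assumptions.

```python
# Generic event-triggered sample-and-hold loop (illustrative only).
import numpy as np

def should_trigger(x, x_event, t, t_event, alpha=0.1, dwell=0.05):
    # Dwell-time condition: forbid events closer than `dwell` seconds,
    # the standard device for excluding Zeno behavior.
    if t - t_event < dwell:
        return False
    # Relative-threshold rule: fire when the measurement error since the
    # last event outgrows a fraction `alpha` of the current state norm.
    return np.linalg.norm(x - x_event) > alpha * np.linalg.norm(x)

# The control is recomputed only at event times and held in between.
x, dt = np.array([1.0, -0.5]), 0.01
x_event, t_event, u = x.copy(), 0.0, -0.5 * x   # illustrative feedback law
events = 0
for k in range(1, 501):
    t = k * dt
    x = x + dt * u                # hold the last control between events
    if should_trigger(x, x_event, t, t_event):
        x_event, t_event, u = x.copy(), t, -0.5 * x
        events += 1
print(f"events: {events} over {t:.2f} s")
```

The trade-off is visible in the two parameters: a larger `alpha` means fewer control updates (less communication) at the cost of tracking performance, while `dwell` lower-bounds the time between any two events.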
“…We employ another approximating structure, called an actor, to approximate the intermittent controller (25). This is expressed, $\forall t \in [r_j, r_{j+1}]$, as

$$u^\star(\hat{s}_{\mathrm{aug}}) = \theta_u^{\star\,T} \varphi_u(\hat{s}_{\mathrm{aug}}) + \epsilon_u(\hat{s}_{\mathrm{aug}}), \quad \forall \hat{s}_{\mathrm{aug}},$$
…”
Section: B. Actor Approximator
Citation type: mentioning
confidence: 99%
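For context, the displayed expression is a standard linear-in-the-parameters actor: a fixed basis $\varphi_u$ weighted by $\theta_u$, with $\epsilon_u$ the representation error. The sketch below illustrates that structure only; the basis, dimensions, and update rule are assumptions, not the cited paper's design.

```python
# Minimal sketch of a linear-in-the-parameters actor u(s) = theta^T phi(s).
# Basis, dimensions, and learning rate are illustrative assumptions.
import numpy as np

def phi(s_aug):
    # Example polynomial basis over a 2-D augmented state; real designs
    # choose phi so the optimal controller lies (near) its span.
    s1, s2 = s_aug
    return np.array([s1, s2, s1 * s2, s1**2, s2**2])

class LinearActor:
    def __init__(self, n_features, n_controls, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.theta = 0.1 * rng.standard_normal((n_features, n_controls))
        self.lr = lr

    def __call__(self, s_aug):
        # u_hat(s) = theta^T phi(s); the residual eps_u(s) is the
        # representation error absorbed by the analysis, not computed.
        return self.theta.T @ phi(s_aug)

    def update(self, s_aug, u_target):
        # Gradient step pushing theta^T phi(s) toward a target control
        # (e.g., a critic-derived policy-improvement direction).
        f = phi(s_aug)
        err = self.theta.T @ f - u_target
        self.theta -= self.lr * np.outer(f, err)

actor = LinearActor(n_features=5, n_controls=1)
print(actor(np.array([0.5, -0.2])))
```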