2019
DOI: 10.48550/arxiv.1907.08352
Preprint

Representation Learning for Classical Planning from Partially Observed Traces

Cited by 2 publications (3 citation statements)
References 22 publications

“…Also, similar to other variants of A* search, our approach is not suitable for problems with a high-dimensional search space. To overcome such limitations, one interesting direction for future work is to extend the proposed approach to planning on general graphs as done by Yang et al (2020); Xiao et al (2019) and with a suitable encoder for high-dimensional spaces (Qureshi et al 2019).…”
Section: Results
confidence: 99%
“…π‘‘π‘œπ‘›π‘’ ← true Upon reaching a state from which a plan to the goal exists, the agent stops exploring. For each of its subgoal learners β„“ σ𝑠𝑔 ∈ 𝐿, it attempts to construct sets of preconditions (characterized as a partial fluent state Οƒπ‘π‘Ÿπ‘’ ) from which its policy can consistently achieve the subgoal state σ𝑠𝑔 (see Section 3.4) (lines [28][29][30][31][32][33][34]. If any such precondition sets Οƒπ‘π‘Ÿπ‘’ exist, the agent constructs the operator π‘œ β˜… such that π‘π‘Ÿπ‘’ (π‘œ β˜… ) = Οƒπ‘π‘Ÿπ‘’ , eff (π‘œ β˜… ) = σ𝑠𝑔 , and without static fluents (all other variables are unknown once the operator is executed; π‘ π‘‘π‘Žπ‘‘π‘–π‘ (π‘œ β˜… ) = βˆ…).…”
Section: Learning Operator Policies
confidence: 99%
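
To make the operator-construction step in the excerpt above concrete, here is a minimal Python sketch. The names (Operator, build_operator) and the dict-based fluent-state encoding are assumptions for illustration, not the cited paper's actual data structures:

    from dataclasses import dataclass

    @dataclass
    class Operator:
        # Illustrative assumption: a partial fluent state is a dict mapping
        # fluent names to values; fluents not listed are unknown.
        pre: dict                        # preconditions: partial fluent state σ_pre
        eff: dict                        # effects: the subgoal state σ_sg
        static: frozenset = frozenset()  # static fluents; ∅ here, so every fluent
                                         # not in eff is unknown after execution

    def build_operator(sigma_pre: dict, sigma_sg: dict) -> Operator:
        """Construct o★ with pre(o★) = σ_pre, eff(o★) = σ_sg, static(o★) = ∅."""
        return Operator(pre=dict(sigma_pre), eff=dict(sigma_sg))

    # Example: a learned precondition set and its subgoal in a made-up
    # gripper-style domain.
    o_star = build_operator({"at-robot": "roomA", "holding-ball1": True},
                            {"at-ball1": "roomA", "holding-ball1": False})
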
“…The examples consist of successful fluent state and operator pairs corresponding to a sequence of transitions in the symbolic domain [35]. Subsequent work has explored how domains can be learned with partial knowledge of successful traces [2,6], and with neural networks capable of approximating from partial traces [33] and learning models from pixels [4,7].…”
Section: Learning Symbolic Action Models
confidence: 99%
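
As a rough illustration of the trace format described in the excerpt above, the following Python sketch builds a trace of (fluent state, operator) pairs and masks fluents to obtain a partially observed variant; all fluent and operator names are hypothetical, not taken from the cited papers:

    import random

    # One successful trace: a sequence of (fluent state, operator) pairs.
    trace = [
        ({"at-robot": "roomA", "holding-ball1": False}, "pick(ball1, roomA)"),
        ({"at-robot": "roomA", "holding-ball1": True},  "move(roomA, roomB)"),
        ({"at-robot": "roomB", "holding-ball1": True},  "drop(ball1, roomB)"),
    ]

    def partially_observe(state, keep=0.5, rng=random.Random(0)):
        """Drop each fluent with probability 1 - keep, yielding a partial state."""
        return {k: v for k, v in state.items() if rng.random() < keep}

    # A partially observed trace of the kind the learning methods above consume.
    partial_trace = [(partially_observe(s), op) for s, op in trace]
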