2012
DOI: 10.1007/978-3-642-27645-3_9
Hierarchical Approaches

Cited by 22 publications (14 citation statements)
References 29 publications
“…This work highlights the fact that many sequential decision problems contain repeated sets of actions that are related to solving the same subgoals. At a high level, many of the computational approaches for learning which subgoals are useful for a given task involve identifying commonly traversed task states that serve as bottlenecks between large sets of similar states (Hengst, 2012; Tomov, Yagati, Kumar, Yang, & Gershman, 2020). For example, a doorway between two rooms represents a bottleneck linking any state in the first room to any state in the second room.…”
Section: Task Representations
Confidence: 99%
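The bottleneck idea in the statement above can be made concrete with a small sketch. This is an illustrative example, not code from the cited chapter: it builds the two-rooms-and-a-doorway gridworld described in the quote and identifies the bottleneck by counting how often each state lies on a shortest path between the two rooms (a simple stand-in for betweenness-style subgoal discovery).

```python
from collections import deque
from itertools import product

# Two 3x3 rooms joined by a single doorway cell: the classic
# bottleneck example. States are (x, y) grid cells.
left = {(x, y) for x, y in product(range(3), range(3))}
right = {(x + 4, y) for x, y in product(range(3), range(3))}
door = (3, 1)
states = left | right | {door}

def neighbors(s):
    x, y = s
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        n = (x + dx, y + dy)
        if n in states:
            yield n

def shortest_path(start, goal):
    # Plain BFS; returns one shortest path as a list of states.
    prev = {start: None}
    q = deque([start])
    while q:
        s = q.popleft()
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = prev[s]
            return path[::-1]
        for n in neighbors(s):
            if n not in prev:
                prev[n] = s
                q.append(n)
    return []

# Count how often each state lies on a shortest path between a
# left-room state and a right-room state.
visits = {s: 0 for s in states}
for a in left:
    for b in right:
        for s in shortest_path(a, b):
            visits[s] += 1

# Every cross-room path traverses the doorway, so its count equals
# the number of (left, right) pairs -- the bottleneck signature.
```

Since every one of the 9 × 9 cross-room paths passes through the doorway, `visits[door]` attains the maximum count, which is exactly the "commonly traversed state" criterion the statement describes.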
“…The famous 'divide-and-conquer' strategy has been practiced for RL research for a few decades, with a number of studies which show that dividing the problem into sub-problems and making abstractions based on them can significantly improve learning performance (Dietterich, 2000;Hengst, 2012). However, it is not always straightforward to devise a meaningful partitioning scheme.…”
Section: Automatic Abstraction in Reinforcement Learning
Confidence: 99%
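The divide-and-conquer strategy this statement refers to can be sketched with a minimal options-style decomposition. This is an assumed illustration, not the chapter's own code: the task "reach the goal" is split into two sub-problems, each packaged as an option with an initiation set, a local policy, and a termination set, then solved by composing the pieces.

```python
from collections import namedtuple

# An option bundles an initiation set, a local policy, and a
# termination set. Splitting "reach the goal" into "reach the
# doorway" and "doorway to goal" is the sub-problem decomposition
# the statement describes.
Option = namedtuple("Option", ["initiation", "policy", "terminal"])

# 1-D corridor of states 0..6 with a 'doorway' at 3, goal at 6.
DOOR, GOAL = 3, 6

def go_right(s):
    # Deterministic local policy: step one state to the right.
    return s + 1

to_door = Option(initiation=set(range(DOOR)), policy=go_right,
                 terminal={DOOR})
to_goal = Option(initiation={DOOR}, policy=go_right,
                 terminal={GOAL})

def run_option(opt, s):
    # Execute the option's policy until its terminal set is reached.
    while s not in opt.terminal:
        s = opt.policy(s)
    return s

# Compose the two sub-solutions: start -> doorway -> goal.
s = run_option(to_door, 0)
s = run_option(to_goal, s)
```

Each option can be learned or planned for independently over a much smaller state region, which is why such partitioning can improve learning performance; the harder open problem, as the statement notes, is discovering a meaningful partition automatically.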
“…Our DMSSPs model share elements with previously studied MDP models: arbitrarily modulated transition functions [8], stochastic shortest paths with online information [9], and factored hybrid-space MDPs [10]. Our HSP algorithm uses ideas from heuristic search [11,12] and search-based planning for multi-step tasks [13,14], approximate dynamic programming [15,6], hierarchical planning for solving large MDPs [16,17,18], and interleaved planning and execution [19,20]. A body of relevant previous work incorporates heuristic search and classical AI techniques in algorithms for solving MDPs [21,22,23].…”
Section: Related Work Overview
Confidence: 99%