2020
DOI: 10.48550/arxiv.2007.09841
Preprint

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation

Abstract: We introduce a learning-based approach for room navigation using semantic maps. Our proposed architecture learns to predict top-down belief maps of regions that lie beyond the agent's field of view while modeling architectural and stylistic regularities in houses. First, we train a model to generate amodal semantic top-down maps indicating beliefs about the location, size, and shape of rooms by learning the underlying architectural patterns in houses. Next, we use these maps to predict a …
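The abstract describes a two-stage idea: a trained model produces amodal belief maps over room semantics for regions the agent has not yet seen, which are then used for navigation. A minimal sketch of the first stage's output is how a partial top-down observation might be fused with a learned layout prior into a per-cell belief map. The function name `amodal_belief` and the array layout are illustrative assumptions, not the paper's actual architecture (which is a learned model, not a fixed rule):

```python
import numpy as np

def amodal_belief(observed: np.ndarray, prior: np.ndarray) -> np.ndarray:
    """Fuse a partial top-down semantic observation with a layout prior.

    observed: (H, W, C) one-hot room labels; all-zero cells are unseen.
    prior:    (H, W, C) per-cell categorical prior (standing in for the
              output of a trained amodal-map model).
    Returns an (H, W, C) belief map: observed cells keep their labels,
    unseen cells fall back to the prior.
    """
    seen = observed.sum(axis=-1, keepdims=True) > 0  # mask of observed cells
    belief = np.where(seen, observed, prior)
    # Normalize so each cell is a proper distribution over room classes.
    return belief / belief.sum(axis=-1, keepdims=True)
```

In the paper itself, the belief over unseen regions is produced by a model trained on architectural regularities in houses; the fixed prior here only stands in for that learned component.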

Cited by 10 publications (14 citation statements)
References 43 publications
“…Liang et al [41] design a semantic scene completion module to complete the unexplored scene. Similar to it, some methods achieve image-level extrapolation of depth and semantics [45], high-dimension feature space extrapolation [46], semantic scene completion [42], [43], attention probability modeling [44], and room-type prediction [47]. We formulate the ObjectNav task as target distance prediction and path planning.…”
Section: B. Learning-based Goal Navigation Methods
Mentioning confidence: 99%
“…[55] uses Graph Convolutional Networks to incorporate prior semantic knowledge in a deep reinforcement learning framework, while [14] learns semantic associations by defining a topological map over the 2D scene. Finally, [38] learns to predict room layouts in unobserved regions of the map in an attempt to model architectural regularities in houses for the task of room navigation. In contrast to all these works we formulate an active, target-independent strategy to predict semantic maps and define goal selection objectives.…”
Section: Related Work
Mentioning confidence: 99%
“…Moreover, the semantic priors are usually encoded implicitly by goal oriented navigation policy functions [14] and are, thus, target-dependent. More relevant to ours, other works have introduced spatial prediction models that either anticipate occupancy [42] or room layouts [38] beyond the agent's field of view and demonstrated improved performance on navigation tasks. Our work differs from these methods in three principled ways: 1) We formulate an active training strategy for learning the semantic maps, 2) we exploit the uncertainty over the predictions in the planning process, and 3) while occupancy anticipation mainly learns to extend what it already sees, applying the same concept to semantics is not trivial.…”
Section: Introduction
Mentioning confidence: 96%
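The citing work above says it exploits the uncertainty over predicted semantic maps during planning. One common way to do this, sketched here as an illustrative assumption rather than that paper's actual objective, is to direct exploration toward the map cell whose predicted class distribution has the highest entropy:

```python
import numpy as np

def entropy_goal(belief: np.ndarray) -> tuple:
    """Pick the cell with highest predictive entropy as the next exploration goal.

    belief: (H, W, C) per-cell distribution over semantic classes.
    Returns the (row, col) index of the most uncertain cell.
    """
    eps = 1e-12  # avoid log(0) for near-certain cells
    cell_entropy = -(belief * np.log(belief + eps)).sum(axis=-1)
    idx = np.unravel_index(np.argmax(cell_entropy), cell_entropy.shape)
    return tuple(int(i) for i in idx)
```

A uniform distribution over classes maximizes entropy, so cells the model is least sure about attract the agent; confidently predicted cells (seen or well extrapolated) are deprioritized.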
“…with as few steps as possible). The internal model can be in forms like a topological graph map [51], semantic map [52], occupancy map [53] or spatial memory [54,55]. These map-based architectures can capture geometry and semantics, allowing for more efficient policy learning and planning [53] as compared to reactive and recurrent neural network policies [56].…”
Section: Visual Exploration
Mentioning confidence: 99%