2022
DOI: 10.48550/arxiv.2203.10421
Preprint

CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation

Cited by 2 publications (8 citation statements) | References: 0 publications
“…Zero-shot Models. The recent success of large pretrained vision and language models [10], [27] has spurred a flurry of interest in applying their zero-shot capabilities to different domains, including object detection and segmentation [28], [29], [11], robot manipulation [30], [31], [32], [33], and navigation [13], [12], [34]. Most related to our work is the approach denoted LM-Nav [13], which combines three pre-trained models to navigate via a topological graph in the real world.…”
Section: Related Work
confidence: 99%
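For intuition, here is a minimal sketch of the topological-graph navigation scheme this statement attributes to LM-Nav [13], not the authors' code. Landmark phrases are assumed to be pre-extracted from the instruction (LM-Nav uses a large language model for that step), and the `similarity` callable and node `image` attribute are hypothetical stand-ins for any image-text scorer such as CLIP cosine similarity.

```python
# Minimal sketch, not the LM-Nav implementation: ground landmark phrases to
# nodes of a topological graph and chain shortest paths between them.
import networkx as nx

def ground_landmark(graph, phrase, similarity):
    """Pick the node whose stored observation image best matches the phrase."""
    return max(graph.nodes,
               key=lambda n: similarity(graph.nodes[n]["image"], phrase))

def plan_route(graph, start, landmark_phrases, similarity):
    """Visit each grounded landmark in instruction order via shortest paths."""
    route, current = [start], start
    for phrase in landmark_phrases:
        goal = ground_landmark(graph, phrase, similarity)
        route += nx.shortest_path(graph, current, goal)[1:]
        current = goal
    return route
```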
“…Most related to our work is the approach denoted LM-Nav [13], which combines three pre-trained models to navigate via a topological graph in the real world. CoW [12] performs zero-shot language-based object navigation by combining CLIP-based [10] saliency maps and traditional exploration methods. However, both LM-Nav [13] and CoW [12] are limited to navigating to object landmarks and are less capable of understanding finer-grained queries, such as "to the left of the chair" and "in between the TV and the sofa".…”
Section: Related Work
confidence: 99%
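The following is a minimal sketch of the CLIP-based grounding idea this statement attributes to CoW [12], not the CoW implementation: it scores image crops against the language query with the open-source CLIP package (pip install git+https://github.com/openai/CLIP.git). CoW itself derives gradient-based saliency from CLIP and pairs it with a classical exploration policy, so this crop-scoring variant is only an illustrative approximation.

```python
# Minimal sketch, not the CoW method: CLIP similarity between a text query
# and a grid of image crops, as a stand-in for a saliency map.
import torch
import clip
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")

def patch_scores(frame: Image.Image, query: str, grid: int = 4):
    """Return ((row, col), similarity) for each cell of a grid of crops.
    A high-scoring cell suggests the queried object is in view; otherwise
    the agent would fall back to an exploration policy."""
    tokens = clip.tokenize([query])
    with torch.no_grad():
        t = model.encode_text(tokens)
        t = t / t.norm(dim=-1, keepdim=True)
        w, h = frame.size
        scores = []
        for row in range(grid):
            for col in range(grid):
                crop = frame.crop((col * w // grid, row * h // grid,
                                   (col + 1) * w // grid, (row + 1) * h // grid))
                v = model.encode_image(preprocess(crop).unsqueeze(0))
                v = v / v.norm(dim=-1, keepdim=True)
                scores.append(((row, col), float(v @ t.T)))
    return scores
```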