2024
DOI: 10.1609/aaai.v38i7.28597
|View full text |Cite
|
Sign up to set email alerts
|

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

Gengze Zhou,
Yicong Hong,
Qi Wu

Abstract: Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such a trend underscored the potential of training LLMs with unlimited language data, advancing the development of a universal embodied agent. In this work, we introduce the NavGPT, a purely LLM-based instruction-following navigation agent, to reveal the reasoning capability of GPT models in complex embodied scenes by performing zero-shot … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2024
2024
2024
2024

Publication Types

Select...
5

Relationship

0
5

Authors

Journals

citations
Cited by 21 publications
references
References 43 publications
0
0
0
Order By: Relevance