Multi-modal Discriminative Model for Vision-and-Language Navigation
Preprint, 2019
DOI: 10.48550/arxiv.1905.13358
Cited by 5 publications (2 citation statements) · References 13 publications
Citing publications span 2019–2023.

Citation statements
“…Our work constructs a model that can be useful for evaluating and enhancing instruction-generation models. Huang et al (2019) and Zhao et al (2021) train LSTM-based discriminative models with contrastive learning to score instructions. We follow a similar approach but focus on identifying word-level hallucinations, and effectively leverage a large pre-trained Transformer model.…”
Section: Related Work (mentioning)
confidence: 99%
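The quoted passage describes discriminative models trained with contrastive learning to score navigation instructions against trajectories. The following is a hedged sketch of that general idea, not the cited authors' code: the module names, dimensions, and in-batch negative sampling are illustrative assumptions.

```python
# Illustrative sketch (assumptions throughout): an LSTM instruction encoder and a
# trajectory encoder trained contrastively so that each instruction scores highest
# with its own trajectory.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InstructionTrajectoryScorer(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, visual_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.instr_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.traj_proj = nn.Linear(visual_dim, hidden_dim)  # project per-step visual features

    def forward(self, instr_tokens, traj_feats):
        # instr_tokens: (B, L) word ids; traj_feats: (B, T, visual_dim) per-step features
        _, (h, _) = self.instr_lstm(self.embed(instr_tokens))
        instr_vec = h[-1]                                   # (B, hidden_dim)
        traj_vec = self.traj_proj(traj_feats).mean(dim=1)   # (B, hidden_dim)
        return instr_vec, traj_vec

def contrastive_loss(instr_vec, traj_vec, temperature=0.1):
    # In-batch negatives: the diagonal of the similarity matrix holds the true pairs.
    sims = F.normalize(instr_vec, dim=-1) @ F.normalize(traj_vec, dim=-1).T / temperature
    targets = torch.arange(sims.size(0))
    return F.cross_entropy(sims, targets)

# Toy usage: score a batch of 4 instruction/trajectory pairs.
model = InstructionTrajectoryScorer()
instr = torch.randint(0, 1000, (4, 12))
traj = torch.randn(4, 6, 2048)
loss = contrastive_loss(*model(instr, traj))
loss.backward()
```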
“…Research on the VLN task has made significant progress in the past few years. Attention mechanisms across different modalities are widely used to learn an alignment between vision and language and to boost performance on this task [1], [7], [11]-[16]. Beyond new modelling architectures, improvements also come from new learning approaches.…”
Section: Related Work (mentioning)
confidence: 99%
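The passage above refers to cross-modal attention for aligning vision and language. Below is a hedged, self-contained sketch of one common form of such a layer, in which visual features attend over instruction word representations; the class name, dimensions, and single-head design are assumptions for illustration only.

```python
# Illustrative sketch (not from the cited papers): visual features as queries,
# instruction word features as keys/values, producing language-conditioned
# visual representations.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, text_dim=128, visual_dim=2048, attn_dim=128):
        super().__init__()
        self.query = nn.Linear(visual_dim, attn_dim)  # visual features form queries
        self.key = nn.Linear(text_dim, attn_dim)      # word features form keys
        self.value = nn.Linear(text_dim, attn_dim)    # and values

    def forward(self, visual_feats, word_feats):
        # visual_feats: (B, V, visual_dim); word_feats: (B, L, text_dim)
        q, k, v = self.query(visual_feats), self.key(word_feats), self.value(word_feats)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)  # (B, V, L)
        return attn @ v  # language-grounded visual features, (B, V, attn_dim)

# Toy usage: 36 visual regions attending over a 10-word instruction.
layer = CrossModalAttention()
grounded = layer(torch.randn(2, 36, 2048), torch.randn(2, 10, 128))
```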