2023
DOI: 10.15625/1813-9663/18157

EVJVQA Challenge: Multilingual Visual Question Answering

Ngan Luu-Thuy Nguyen,
Nghia Hieu Nguyen,
Duong T.D. Vo
et al.

Abstract: Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers. English is a resource-rich language that has witnessed various developments in datasets and models for visual question answering. Visual question answering in other languages also needs to be developed in terms of resources and models. In addition, there is no multilingual dataset targeting the visual content of a particular country with its own objects an…

Cited by 6 publications (3 citation statements)
References 22 publications

“…Consistent with the preceding research conducted by [7,19,22,34], elucidating the evaluation metrics utilized for gauging the model's efficacy is crucial before delving into the analysis of the experimental outcomes. The appraisal in this research encompasses four pivotal performance metrics: F1 score, Precision, Recall, and Accuracy.…”
Section: Evaluation Metrics
confidence: 79%
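The four metrics named in this citation statement are standard classification scores. As a minimal, purely illustrative sketch (not taken from the cited works, whose exact evaluation protocol is not shown in this excerpt), they can be computed with scikit-learn on toy binary labels:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy binary labels standing in for per-example correctness judgements;
# a real VQA evaluation would derive these from model answers vs. ground truth.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
```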
“…Building on these foundational steps, Nguyen et al in 2022 [34] broke new ground by unveiling a multilingual dataset through a shared task. Notably, this dataset incorporates the Vietnamese language, broadening the scope of VQA research to delve into the Vietnamese linguistic setting.…”
Section: Related Work
confidence: 99%
“…In our experiment, we used UIT-EVJVQA [26], the first mVQA dataset with three languages, including English and Vietnamese released by VLSP-2022 Organizers for EVJVQA challenge (https://vlsp.org.vn/vlsp2022/eval/evjvqa). This dataset includes question-answer pairs created by humans on a set of images taken in Vietnam, with the answer created from the input question and the corresponding image.…”
Section: Dataset
confidence: 99%
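For illustration only, a single example in a multilingual VQA dataset of this kind could be represented roughly as below; the field names are hypothetical and are not taken from the actual UIT-EVJVQA release.

```python
import json

# Hypothetical record layout for a multilingual VQA example;
# the real UIT-EVJVQA schema may use different field names.
sample = {
    "image_id": 1234,  # an image taken in Vietnam
    "language": "vi",  # the excerpt names English and Vietnamese among the three languages
    "question": "Có bao nhiêu chiếc xe máy trong ảnh?",  # "How many motorbikes are in the picture?"
    "answer": "ba chiếc xe máy",                          # "three motorbikes"
}

print(json.dumps(sample, ensure_ascii=False, indent=2))
```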