2021
DOI: 10.1109/tcsvt.2021.3051277
End-to-End Video Question-Answer Generation With Generator-Pretester Network

Cited by 30 publications (6 citation statements)
References 59 publications
“…Video QG is an emerging area in which several works have been introduced recently. In [ 128 ], joint QA-QG from videos is investigated with a newly constructed architecture consisting of a generator, which produces a question given a video clip and an answer, and a pre-tester, which attempts to answer the generated question. A video encoder extracts features from 20 frames of each video, which are then encoded using Faster R-CNN and ResNet-101.…”
Section: Challenges and Future Directions
confidence: 99%
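The generator/pre-tester interplay described in the excerpt above can be sketched as follows. This is a toy illustration, not the authors' implementation: `Generator`, `Pretester`, and `pretest_signal` are hypothetical stand-ins for the learned networks, and the dictionary-based QA lookup replaces a real video-QA model. The point is the training signal: a generated question counts as good only when the pre-tester can recover the target answer from it.

```python
class Generator:
    """Toy stand-in for the question-generator network (hypothetical).

    A real generator decodes a question token by token, conditioned on
    encoded video features and the target answer.
    """

    def generate(self, video_features: list, answer: str) -> str:
        subject = video_features[0] if video_features else "the video"
        return f"What {answer}-related object appears near {subject}?"


class Pretester:
    """Toy stand-in for the pre-tester (video-QA) network (hypothetical).

    `memory` maps questions to answers; a real pre-tester predicts the
    answer from the video features and the question.
    """

    def __init__(self, memory: dict):
        self.memory = memory

    def answer(self, video_features: list, question: str) -> str:
        return self.memory.get(question, "unknown")


def pretest_signal(gen: Generator, tester: Pretester,
                   video_features: list, target_answer: str) -> bool:
    """True when the pre-tester recovers the target answer from the
    generated question, i.e. the question is answerable. Agreement (or
    disagreement) is the consistency signal used during joint training."""
    question = gen.generate(video_features, target_answer)
    return tester.answer(video_features, question) == target_answer
```

In the actual model both components are trained end to end, so the pre-tester's failure to answer a generated question backpropagates into the generator; the boolean here merely marks where that loss would be computed.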
“…Some works explore this task in text using techniques such as pipelines (Subramanian et al., 2018), multi-agent systems (Wang et al., 2019), hierarchical variational models (Subramanian et al., 2018), or coreference knowledge (Lee et al., 2020). Su et al. (2021) also propose a model for QAP generation from video. However, such QAP generation works assume answers are selected from spans of the input context (Subramanian et al., 2018; Lee et al., 2020; Wang et al., 2019) or from given candidates (Su et al., 2021).…”
Section: Related Work
confidence: 99%
“…Su et al. (2021) also propose a model for QAP generation from video. However, such QAP generation works assume answers are selected from spans of the input context (Subramanian et al., 2018; Lee et al., 2020; Wang et al., 2019) or from given candidates (Su et al., 2021). As answers cannot be extracted directly from images and no candidates are given, the above methods cannot simply be applied to images.…”
Section: Related Work
confidence: 99%
“…answer [1], [2], [3], [4]. Further, video captioning uses relational models to generate captions for related videos; it has recently shifted toward dense video captioning, producing many captions simultaneously rather than a single one [5].…”
Section: Introduction
confidence: 99%