DCA: Diversified Co-attention Towards Informative Live Video Commenting
Year: 2020
DOI: 10.1007/978-3-030-60457-8_1

Cited by 10 publications (8 citation statements)
References 23 publications
“…We note that reference-based metrics for generation tasks like BLEU and ROUGE are not suitable for evaluation of video comments (Das et al., 2017; Ma et al., 2019; Zhang et al., 2020). Hence we follow (Das et al., 2017) and focus on the ability to rank the correct comment originally appearing at this point in the video over other comments taken from the dataset.…”
Section: Discussion (mentioning)
confidence: 99%
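The ranking-style evaluation described in the statement above is straightforward to implement. The sketch below is a minimal illustration, assuming the model exposes a scoring function over (video context, candidate comment) pairs and that each test case bundles the ground-truth comment with distractors drawn from the dataset; the function names, candidate construction, and metric choices (recall@k, MRR, mean rank) are illustrative assumptions rather than the exact protocol of the cited papers.

```python
# Minimal sketch of ranking-based evaluation for video comments: instead of
# comparing a generated comment to references with BLEU/ROUGE, the model scores
# a candidate set containing the ground-truth comment plus distractors, and we
# measure how highly the true comment is ranked. Names and metrics here are
# illustrative assumptions, not the exact protocol of any cited paper.
from typing import Callable, List, Sequence


def rank_of_truth(scores: Sequence[float], truth_idx: int) -> int:
    """1-based rank of the ground-truth candidate under descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order.index(truth_idx) + 1


def ranking_metrics(all_ranks: List[int], ks=(1, 5, 10)) -> dict:
    """Aggregate recall@k, mean reciprocal rank, and mean rank over test cases."""
    n = len(all_ranks)
    metrics = {f"recall@{k}": sum(r <= k for r in all_ranks) / n for k in ks}
    metrics["mrr"] = sum(1.0 / r for r in all_ranks) / n
    metrics["mean_rank"] = sum(all_ranks) / n
    return metrics


def evaluate(score_comment: Callable[[dict, str], float],
             test_cases: List[dict]) -> dict:
    """Each test case holds a video context, a candidate comment list, and the
    index of the comment that actually appeared at that point in the video."""
    ranks = []
    for case in test_cases:
        scores = [score_comment(case["context"], c) for c in case["candidates"]]
        ranks.append(rank_of_truth(scores, case["truth_idx"]))
    return ranking_metrics(ranks)
```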
“…Specifically, we use the code from (Wu et al., 2020), trained on our full dataset with only video frames and surrounding comments as input. The models proposed in (Chaoqun et al., 2020; Zhang et al., 2020) are very recent and their code is not publicly available yet, so we do not consider them among our baseline methods. Older neural architectures such as LSTMs are also not included in this study, since it is well established that Transformers are the method of choice for modelling multi-modal signals.…”
Section: Methods (mentioning)
confidence: 99%
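For context, the kind of Transformer-based multi-modal baseline this statement describes can be sketched roughly as follows: pre-extracted video-frame features and surrounding-comment tokens are projected into a shared space, concatenated, and fused by a standard Transformer encoder. This is a minimal PyTorch sketch under assumed module names and dimensions, not the actual architecture of Wu et al. (2020).

```python
# A minimal sketch of a Transformer-based fusion of video frames and
# surrounding comments. Dimensions and module names are illustrative
# assumptions, not the exact architecture of any cited paper.
import torch
import torch.nn as nn


class FrameCommentFusion(nn.Module):
    def __init__(self, frame_dim=2048, vocab_size=30000, d_model=512,
                 nhead=8, num_layers=4):
        super().__init__()
        self.frame_proj = nn.Linear(frame_dim, d_model)      # project CNN frame features
        self.token_emb = nn.Embedding(vocab_size, d_model)   # embed comment tokens
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, frame_feats, comment_ids):
        # frame_feats: (batch, n_frames, frame_dim); comment_ids: (batch, n_tokens)
        frames = self.frame_proj(frame_feats)
        tokens = self.token_emb(comment_ids)
        fused = torch.cat([frames, tokens], dim=1)  # joint sequence over both modalities
        return self.encoder(fused)                  # contextualised multi-modal states


# Usage: feats = torch.randn(2, 8, 2048); ids = torch.randint(0, 30000, (2, 20))
# out = FrameCommentFusion()(feats, ids)  # shape (2, 28, 512)
```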
“…Previous studies incorporate auxiliary knowledge sources like scene graphs or object tags to explicitly indicate the cross-modal mapping. Other studies try to establish fine-grained interaction via cross-modal attention to reinforce the focus from words to their most relevant regions, and vice versa (Wang et al., 2019; Messina et al., 2020; Lee et al., 2018; Zhang et al., 2020b).…”
Section: Self-attention Attention Distribution (mentioning)
confidence: 99%
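The bidirectional, fine-grained cross-modal attention mentioned in this statement (words attending to their most relevant image regions, and regions attending back to words) can be written compactly. The sketch below is a generic formulation under assumed tensor shapes, not the specific mechanism of any of the cited models.

```python
# Generic bidirectional cross-modal attention between word features and image
# region features. Shapes are assumptions for illustration only.
import torch
import torch.nn.functional as F


def cross_modal_attention(words, regions):
    """words: (batch, n_words, d); regions: (batch, n_regions, d)."""
    # Similarity between every word and every region.
    sim = torch.bmm(words, regions.transpose(1, 2))                   # (B, n_words, n_regions)

    # Words -> regions: each word aggregates its most relevant regions.
    w2r = torch.bmm(F.softmax(sim, dim=-1), regions)                  # (B, n_words, d)

    # Regions -> words: each region aggregates its most relevant words.
    r2w = torch.bmm(F.softmax(sim.transpose(1, 2), dim=-1), words)    # (B, n_regions, d)
    return w2r, r2w
```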
“…In previous work, we introduced an open implementation of Livebot [21] which addresses a number of shortcomings in the original implementation. Follow-up work has recently looked at extending the Livebot architecture with different multi-modal fusion approaches [3,27].…”
Section: Related Work 2.1 Automated Danmu Commenting (mentioning)
confidence: 99%