2023
DOI: 10.1609/aaai.v37i3.25484

Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning

Abstract: Video captioning aims to generate natural language sentences that accurately describe a given video. Existing methods achieve favorable generation by exploring richer visual representations in the encoding phase or by improving the decoding ability. However, the long-tailed problem hinders these attempts on low-frequency tokens, which occur rarely but carry critical semantics and play a vital role in detailed generation. In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diff…

Cited by: 19 publications
References: 27 publications