2016
DOI: 10.1007/978-1-4939-3435-5_16

Multimodal Saliency Models for Videos

Cited by 23 publications (11 citation statements)
References 58 publications
“…A. Setup 1) Datasets: The proposed method is trained and evaluated on AVAD [53], Coutrot1 [68], [69], Coutrot2 [68], [69], DIEM [70], ETMD [71], [72] and SumMe [72], [73] datasets. These datasets contains various types videos accompanied with audios.…”
Section: Experiments and Results (citation type: mentioning)
confidence: 99%
“…The dataset also contains the eye-tracking data from 16 participants. 2) The Coutrot1 and Coutrot2 datasets are split from the Coutrot dataset [68], [69]. The Coutrot1 dataset is with 60 video clips covering 4 visual categories: one moving object, several moving objects, landscapes, and faces.…”
Section: Experiments and Results (citation type: mentioning)
confidence: 99%
“…At the same time, we evaluate our model on six audio-video saliency datasets: DIEM [30], Coutrot1 [11][12], Coutrot2 [11] [12], AVAD [29], ETMD [21], SumMe [16].…”
Section: Datasets (citation type: mentioning)
confidence: 99%
“…Of more consequence is the lack of a model for computation of audiovisual saliency in complex video sequences. Existing literature for audio-video saliency modeling is scarce and often targets a specific class of videos [10], [27], [28]. Therefore, an extended saliency model to predict salient regions in complex videos with different sound classes is required.…”
Section: Introduction (citation type: mentioning)
confidence: 99%