2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00262

VideoMem: Constructing, Analyzing, Predicting Short-Term and Long-Term Video Memorability

Citations: cited by 34 publications (47 citation statements)
References: 25 publications
“…From Table IV we can see that our audio gestalt based approach outperforms all other approaches except SemanticMemNet [32], the model introduced alongside the Memento10k dataset. [Table IV rows, as extracted: … [10] 0.552; Cohendet et al. (ResNet3D) [10] 0.574; Feature Extraction + Regression (as in [43]) 0.615; SemanticMemNet [32] 0.663; Audio Gestalt 0.618*] With respect to our results in Table III, the general trend for predicting video recognition memorability seems to be that the more modalities used, the better the predictions. Even the addition of a poorly-performing individual audio model (0.2913) with a better-performing individual visual model (0.4808), produces an increase in performance (0.4992).…”
Section: Results (mentioning)
confidence: 62%
“…The number of objects depicted in an image does not appear to directly relate to its overall recognition memorability [9], and likewise with properties such as aesthetics and interestingness [1]. However, combinations of semantically based attributes, such as object/scene category, emotion or actions, are predictive of recognition memorability [10]. Additionally, scrambled images retain consistencies in recognition memorability, but only for short time periods (seconds).…”
Section: A. Visual Memorability (mentioning)
confidence: 98%
“…Furthermore, every video must also have a memorability score attached indicating its degree of memorability. Amidst the corpora accessible by the research community that meet our criteria, two in particular stand out: VideoMem and Memento10K [19,20]. Although it is true that data are extracted from different sources and their annotation procedures are not exactly equal, in this study we shall consider that both sets of labels represent the same concept closely enough to model them following the very same principles.…”
Section: Datasets (mentioning)
confidence: 99%
“…Videos are never presented in the same fixed order, but randomly. Because it has been observed that video memorability depends in a linear way on the number of videos between two occurrences [19,20], in order to homogenize memorability labels among videos a linear correction (firstly introduced in [16]) is applied to the raw hit scores, thus obtaining the final set of labels.…”
Section: VideoMem (mentioning)
confidence: 99%
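
The linear correction mentioned in the statement above can be illustrated with a short sketch. The quoted text does not give the formula, so everything here is an assumption made for illustration: hit scores are modelled as a straight line in the lag (the number of videos between two occurrences), the slope is fitted by least squares over all observations, and each observation is shifted to a common reference lag before averaging per video. The function names and the `reference_lag` parameter are hypothetical and are not taken from VideoMem, Memento10K, or the citing paper.

```python
import numpy as np


def fit_lag_slope(lags, hits):
    """Least-squares slope and intercept of hit score versus lag (assumed linear)."""
    slope, intercept = np.polyfit(
        np.asarray(lags, dtype=float), np.asarray(hits, dtype=float), deg=1
    )
    return float(slope), float(intercept)


def corrected_scores(video_ids, lags, hits, reference_lag):
    """Shift every observation to a common reference lag, then average per video.

    Assumption: hit ~ intercept + slope * lag, so an observation made at lag t
    is adjusted by -slope * (t - reference_lag).
    """
    lags = np.asarray(lags, dtype=float)
    hits = np.asarray(hits, dtype=float)
    slope, _ = fit_lag_slope(lags, hits)
    adjusted = hits - slope * (lags - reference_lag)

    ids = np.asarray(video_ids)
    scores = {}
    for vid in sorted(set(video_ids)):
        # Mean of the lag-adjusted observations for this video.
        scores[vid] = float(adjusted[ids == vid].mean())
    return slope, scores


if __name__ == "__main__":
    # Toy data: hit scores that fall off linearly with lag.
    slope, scores = corrected_scores(
        video_ids=["a", "a", "b", "b"],
        lags=[0, 10, 20, 30],
        hits=[1.0, 0.9, 0.8, 0.7],
        reference_lag=10,
    )
    print(slope, scores)
```

In this sketch a video tested at a longer lag than the reference has its score raised when the fitted slope is negative, which is one way to make labels comparable across videos shown at different positions.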
“…The second point, closely related to the previous one, is the lack of a common definition for VM. Regarding modelling, previous attempts at predicting VM [3,12] have highlighted several features which contribute to the prediction of VM, such as semantic, saliency and colour features, but the work is far from complete and our capacity to propose effective computational models will help to meet the challenge of VM prediction. The goal of this task is to participate in the harmonisation and the advancement of this emerging multimedia field.…”
Section: Introduction (mentioning)
confidence: 99%