2019
DOI: 10.1007/978-3-030-20887-5_36

Fast Video Shot Transition Localization with Deep Structured Models

Abstract: Detection of video shot transitions is a crucial pre-processing step in video analysis. Previous studies are restricted to detecting sudden content changes between frames through similarity measurement, and multi-scale operations are widely used to handle transitions of various lengths. However, localization of gradual transitions is still underexplored due to the high visual similarity between adjacent frames. Cut shot transitions are abrupt semantic breaks, while gradual shot transitions contain low-lev…
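To make the abstract's point concrete, here is a minimal sketch of similarity-based cut detection, and of why a gradual transition slips past it. It is not the paper's method. The frame format (flattened grayscale pixel lists), the 16-bin histogram, the 0.5 threshold, and the function names `histogram`, `histogram_distance` and `detect_cuts` are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed frame format: a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized intensity histogram of one frame."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero on an empty frame
    return [c / total for c in counts]


def histogram_distance(a: Frame, b: Frame, bins: int = 16) -> float:
    """Half the L1 distance between histograms: 0.0 (same) to 1.0 (disjoint)."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return 0.5 * sum(abs(x - y) for x, y in zip(ha, hb))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i where frames i-1 and i differ by more than threshold."""
    return [
        i
        for i in range(1, len(frames))
        if histogram_distance(frames[i - 1], frames[i]) > threshold
    ]


if __name__ == "__main__":
    dark, bright = [10] * 64, [240] * 64
    # Abrupt cut: the content changes completely between frames 2 and 3.
    print(detect_cuts([dark, dark, dark, bright, bright]))
    # Gradual dissolve: each step changes only 8 of 64 pixels, so no
    # adjacent pair exceeds the threshold and nothing is reported.
    dissolve = [[240] * k + [10] * (64 - k) for k in range(0, 65, 8)]
    print(detect_cuts(dissolve))
```

In the dissolve, the first and last frames are as different as the two sides of the cut, yet every adjacent pair differs by only 0.125, which illustrates the high visual similarity between adjacent frames that the abstract describes.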

Cited by 28 publications (27 citation statements)
References 29 publications
“…Chikashi et al [7] proposed a query method based on trajectory extraction and encoding relationships between objects. In [2], [37], authors proposed a pre-trained Convolutional Neural Network (CNN) to detect objects in video frames. For query representation, they proposed complex linguistic rules for extracting relevant parts from video data.…”
Section: B Video Retrieval Systems From Text Query (mentioning)
confidence: 99%
“…However, this makes it very slow to be used in real-time [1] due to working with the high dimensional vectors. Then, the system returns the images that most closely resemble the query image [2].…”
Section: Introduction (mentioning)
confidence: 99%
“…Another well-suited place to clip a video sequence is at a scene change (or a "shot boundary") where, for example, the camera angle changes. For this task, we propose to use TransNet V2 [10]: a state-of-the-art scalable architecture for scene boundary detection that has achieved superb performance on shot-boundary datasets, such as ClipShots [11], RAI [36], and BBC [37]. The network takes a sequence of consecutive video frames and uses a series of convolutions together with handcrafted image features.…”
Section: Scene Boundary Detection (mentioning)
confidence: 99%
“…The features are concatenated, and then the system returns a prediction for every frame in the input [10,38]. The TransNet model is pre-trained on transitions extracted from ClipShots [11] and the TRECVid IACC.3 [12] datasets. We propose to compare this pre-trained model with a model trained from scratch on different soccer video clips and identify the influencing factors on performance so that an optimal scene boundary detection model can be integrated into our overall pipeline.…”
Section: Scene Boundary Detection (mentioning)
confidence: 99%
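
The citing paper describes a network that returns a prediction for every input frame. One plausible way to turn such per-frame outputs into shots is sketched below. This is an illustrative post-processing step, not TransNet V2's actual code: the function name `predictions_to_shots`, the 0.5 threshold, and the convention of excluding transition frames from shots are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def predictions_to_shots(
    probs: Sequence[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group frames into shots from per-frame transition probabilities.

    Frames whose probability exceeds `threshold` are treated as transition
    frames and left out of every shot. Each shot is returned as an
    inclusive (start_frame, end_frame) pair.
    """
    shots: List[Tuple[int, int]] = []
    start: Optional[int] = None  # first frame of the shot currently open
    for i, p in enumerate(probs):
        if p > threshold:
            # A transition frame closes the open shot, if there is one.
            if start is not None:
                shots.append((start, i - 1))
                start = None
        elif start is None:
            # First non-transition frame after a transition opens a shot.
            start = i
    if start is not None:
        shots.append((start, len(probs) - 1))
    return shots


if __name__ == "__main__":
    # One single-frame cut (frame 2) and one two-frame transition (frames 5-6).
    print(predictions_to_shots([0.0, 0.1, 0.9, 0.0, 0.0, 0.8, 0.7, 0.1]))
```

A run of consecutive above-threshold frames is handled as one transition, so a gradual transition spanning several frames yields a single boundary rather than several.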