2016 23rd International Conference on Pattern Recognition (ICPR) 2016
DOI: 10.1109/icpr.2016.7900131

DLSTM approach to video modeling with hashing for large-scale video retrieval

Cited by 9 publications
(4 citation statements)
References 17 publications
“…There have been several methods that adapted the use of 2D CNNs along with a sequential data processing NN layer(s) in addition to additional losses to obtain a video hashing deep model that hashes in an end-to-end manner [11,14,15,26-31]. The additional sequential data processing NNs are used to obtain temporal features that are not extracted from the CNNs.…”
Section: Content Based Video Retrieval (mentioning)
confidence: 99%
“…In [26] after extracting features using VGG19 an attention-based LSTM is used to further process the features then a fully connected (FC) layer to get the hashes. [27] uses differential LSTM (DLSTM) along with a variation of AlexNet to encode the features into hashes.…”
Section: Content Based Video Retrieval (mentioning)
confidence: 99%
“…Long Short-Term Memory (LSTM) network, a typical type of recurrent neural networks (RNN) architecture, is proposed by Hochreiter et al. [16] and widely used in many research tasks. Take video for example, this network has been applied in action recognition [17][18][19], video retrieval [20][21], video segmentation [22][23] and Video Captioning [24][25], etc. LSTM-Autoencoder, a typical sequence-to-sequence [26] framework, is proposed by Srivastava et al. [17] and applied for learning video action recognition.…”
Section: Introduction (mentioning)
confidence: 99%
“…Raw frame representations obtained from a CNN were fed to an LSTM, max-pooling, and fully connected layer to attain the fixed-length hash codes. For reducing the feature size to support massive video databases, Zhuang et al. [82] proposed using a differential LSTM (DLSTM) [83] for modeling videos. They extract one video segment to generate a highly compact fixed-length representation of the original video.…”
Section: Image Retrieval (mentioning)
confidence: 99%