2020
DOI: 10.1109/tmm.2019.2932564
Deep Multi-Kernel Convolutional LSTM Networks and an Attention-Based Mechanism for Videos

Abstract: Action recognition greatly benefits motion understanding in video analysis. Recurrent networks such as long short-term memory (LSTM) networks are a popular choice for motion-aware sequence learning tasks. Recently, a convolutional extension of LSTM was proposed, in which input-to-hidden and hidden-to-hidden transitions are modeled through convolution with a single kernel. This implies an unavoidable trade-off between effectiveness and efficiency. Herein, we propose a new enhancement to convolutional LSTM netwo…

Cited by 26 publications (13 citation statements). References 26 publications.
“…Mathematical formulation of ConvLSTM is shown in Eqs. (19)–(23), where ‘*’ is the convolutional operator and ‘·’ represents the Hadamard product [34, 35].…”
Section: Methods
confidence: 99%
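The formulation cited above is the standard ConvLSTM cell: gate pre-activations are computed with the convolution ‘*’, and the state update uses the Hadamard product ‘·’. A minimal single-channel numpy sketch of one step follows; the weight names (`W["xi"]`, `b["i"]`, etc.) and the 3×3 kernels are illustrative choices, not the cited papers' notation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv2d_same(x, k):
    """Naive 'same'-padded 2-D cross-correlation, single channel."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def convlstm_step(x, h, c, W, b):
    """One ConvLSTM step: gates use convolution '*',
    the cell-state update uses Hadamard products '.'."""
    i = sigmoid(conv2d_same(x, W["xi"]) + conv2d_same(h, W["hi"]) + b["i"])
    f = sigmoid(conv2d_same(x, W["xf"]) + conv2d_same(h, W["hf"]) + b["f"])
    o = sigmoid(conv2d_same(x, W["xo"]) + conv2d_same(h, W["ho"]) + b["o"])
    g = np.tanh(conv2d_same(x, W["xc"]) + conv2d_same(h, W["hc"]) + b["c"])
    c_new = f * c + i * g          # Hadamard products
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
h = np.zeros((5, 5))
c = np.zeros((5, 5))
W = {k: rng.standard_normal((3, 3)) * 0.1
     for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc")}
b = {k: 0.0 for k in ("i", "f", "o", "c")}
h, c = convlstm_step(x, h, c, W, b)
print(h.shape)  # (5, 5)
```

Because the gates act per spatial location, hidden and cell states keep the input's spatial layout, which is what makes the cell motion-aware.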
“…where ‘·’ denotes the Hadamard product, and W denotes the weights [4,11,17]. Deep fully connected LSTM (DFC-LSTM), a multivariate variant of LSTM that stacks multiple LSTM layers, was applied in this study because it is easy to train with the FC layer and learns temporal features [18,19].…”
Section: DFC-LSTM
confidence: 99%
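The DFC-LSTM described above stacks fully connected LSTM layers so that layer l+1 consumes layer l's hidden-state sequence. A hedged numpy sketch, with all layer sizes and weight shapes chosen here for illustration (gates use matrix products; the state update uses elementwise Hadamard products):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One fully connected LSTM step; gate pre-activations are
    matrix products, stacked as [i; f; o; g]."""
    z = W @ x + U @ h + b
    n = h.size
    i, f, o = (sigmoid(z[k * n:(k + 1) * n]) for k in range(3))
    g = np.tanh(z[3 * n:4 * n])
    c_new = f * c + i * g          # elementwise (Hadamard) products
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def dfc_lstm(xs, layers):
    """Stacked (deep) FC-LSTM: each layer consumes the previous
    layer's hidden-state sequence."""
    seq = xs
    for (W, U, b, n) in layers:
        h, c, out = np.zeros(n), np.zeros(n), []
        for x in seq:
            h, c = lstm_step(x, h, c, W, U, b)
            out.append(h)
        seq = out
    return seq

rng = np.random.default_rng(1)
d, n = 4, 3  # illustrative input and hidden sizes

def make_layer(d_in, n):
    return (rng.standard_normal((4 * n, d_in)) * 0.1,
            rng.standard_normal((4 * n, n)) * 0.1,
            np.zeros(4 * n), n)

xs = [rng.standard_normal(d) for _ in range(6)]
out = dfc_lstm(xs, [make_layer(d, n), make_layer(n, n)])
print(len(out), out[-1].shape)  # 6 (3,)
```

Each added layer re-summarizes the sequence below it, which is what lets the deep variant capture longer-range temporal features than a single cell.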
“…In CLSTM, the matrix product ‘·’ of the LSTM is replaced by the convolutional operation ‘*’. The equations of CLSTM are given in [4,11,17,19].…”
Section: DCLSTM
confidence: 99%
“…Advanced methods for temporal integration use particular neural network architectures, such as the Convolutional Neural Network (CNN) [58] or the Recurrent Neural Network (RNN) [52]. In particular, LSTM architectures are often chosen for motion-aware sequence learning tasks, which benefits activity recognition [2,4]. Attention models are also harnessed to better integrate spatio-temporal information.…”
Section: Related Work
confidence: 99%