2021
DOI: 10.1155/2021/7799100
Micro Expression Recognition via Dual-Stream Spatiotemporal Attention Network

Abstract: Microexpressions can reveal a person's genuine emotional state and have received wide attention in clinical diagnosis and depression analysis. To address the lack of discriminative spatiotemporal features on small datasets, caused by the short duration and subtle motion of microexpressions, we present a dual-stream spatiotemporal attention network (DSTAN) that integrates a dual-stream spatiotemporal network with an attention mechanism to capture the deformation features and spatiotemporal features of microe…
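The dual-stream idea in the abstract (one stream for appearance, one for motion, fused for classification) can be sketched as a late-fusion step in NumPy. This is only an illustration under assumptions: the feature dimensions, the concatenation-based fusion, and the five-class output are placeholders, not the paper's actual DSTAN design, which is truncated here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def dual_stream_fuse(spatial_feat, temporal_feat, W, b):
    """Late fusion sketch: concatenate per-stream feature vectors,
    then apply a linear classifier with softmax over expression classes."""
    fused = np.concatenate([spatial_feat, temporal_feat])
    return softmax(W @ fused + b)

rng = np.random.default_rng(1)
s = rng.standard_normal(8)               # stand-in spatial-stream features
t = rng.standard_normal(8)               # stand-in temporal-stream features
W = rng.standard_normal((5, 16)) * 0.1   # 5 hypothetical ME classes
b = np.zeros(5)
p = dual_stream_fuse(s, t, W, b)
print(p.shape)  # (5,)
```

The softmax output is a probability distribution over classes, so the two streams contribute jointly to a single prediction.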

Cited by 12 publications (12 citation statements)
References 29 publications
“…Moreover, Yao et al [174] learned the weights of each feature channel adaptively by adding squeeze-and-excitation blocks. Additionally, recent works [56], [82], [83], [175] encoded spatio-temporal and channel attention simultaneously to further boost the representational power of ME features. Specifically, CBAMNet [82] presented a convolutional block attention module (CBAM) cascading the spatial attention module (see Fig.…”
Section: Network Blockmentioning
confidence: 99%
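The squeeze-and-excitation mechanism cited above can be sketched as channel attention in plain NumPy: global-average-pool each channel (squeeze), pass through a small bottleneck with a sigmoid gate (excitation), then rescale the channels. The shapes, reduction ratio, and random weights below are illustrative assumptions, not the cited architecture.

```python
import numpy as np

def squeeze_and_excitation(feat, w1, w2):
    """Channel attention sketch: squeeze (global average pool),
    excitation (bottleneck + sigmoid gate), channel-wise rescale."""
    z = feat.mean(axis=(1, 2))             # squeeze: (C, H, W) -> (C,)
    s = np.maximum(w1 @ z, 0.0)            # reduction + ReLU -> (C//r,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ s)))    # expansion + sigmoid gate -> (C,)
    return feat * g[:, None, None]         # rescale each channel

C, H, W, r = 8, 4, 4, 2                    # toy sizes; r is the reduction ratio
rng = np.random.default_rng(0)
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = squeeze_and_excitation(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because the gate is a sigmoid, each channel is scaled by a factor in (0, 1), letting the network emphasize informative channels and suppress the rest.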
“…For the input, combined inputs generally provide promising results on all of the datasets [70], [94], [175]. This is because different input modalities contribute information from different views.…”
Section: Model Evaluation Protocolsmentioning
confidence: 99%
“…Handcrafted approaches include LBP-based [6], [15]-[20], optical-flow-based [21]-[24], and other techniques (e.g., colour space, histograms). Hybrid approaches use deep learning to extract additional handcrafted features [25]-[27], or fuse handcrafted features with deep-learning features [28]. Deep-learning-based approaches may be separated into three categories depending on the input data: inputting the onset frame and apex frame [31], which extracts facial texture and light-shadow information.…”
Section: Zhao Et Al and 3d Gradient Descriptorsmentioning
confidence: 99%
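The LBP family of handcrafted descriptors mentioned here can be illustrated with the basic 3x3 local binary pattern in NumPy: each interior pixel receives an 8-bit code from thresholding its eight neighbours against the centre value. This is a sketch of the plain operator, not any specific cited variant (e.g., LBP-TOP).

```python
import numpy as np

def lbp_codes(img):
    """Basic 3x3 LBP: threshold the 8 neighbours of each interior
    pixel against the centre and pack the results into an 8-bit code."""
    c = img[1:-1, 1:-1]                    # centre pixels
    # 8 neighbour offsets, clockwise from the top-left
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

img = np.arange(25, dtype=float).reshape(5, 5)
codes = lbp_codes(img)
print(codes.shape)  # (3, 3)
```

Histograms of these codes over facial regions give the texture features that the LBP-based methods above build on.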
“…Yang et al [26] and Bai et al [32] extracted spatial features using VGGNet-16 and VGGFace, respectively, and then extracted temporal features using an LSTM. Wang et al [27] extracted features from frame sequences and optical-flow sequences using STAN-A and STMN-A, respectively, and then conducted fusion classification. Lei et al [31] used graph convolution to extract characteristics of the relations between facial action units (AUs).…”
Section: Zhao Et Al and 3d Gradient Descriptorsmentioning
confidence: 99%
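The spatial-then-temporal pipeline described above (CNN features per frame, then an LSTM over the sequence) can be sketched with a single NumPy LSTM cell iterated over a clip. The feature dimensions and random weights are placeholders, and the per-frame CNN is stood in by random vectors; this is not the cited VGG/LSTM configuration.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates computed from the frame feature x and the
    previous hidden state h; returns the updated (h, c)."""
    H = h.size
    z = W @ x + U @ h + b                       # (4H,) gate pre-activations
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    i, f, o = sig(z[:H]), sig(z[H:2 * H]), sig(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c = f * c + i * g                           # cell-state update
    h = o * np.tanh(c)                          # new hidden state
    return h, c

rng = np.random.default_rng(0)
T, D, H = 6, 16, 8                              # frames, feature dim, hidden dim
frames = rng.standard_normal((T, D))            # stand-in per-frame CNN features
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in frames:                                # temporal pass over the clip
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (8,)
```

The final hidden state h summarizes the clip and would feed a classifier; in the two-stream methods above, a second such pass over optical-flow features is fused with this one before classification.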