2021
DOI: 10.3390/electronics10192380
|View full text |Cite
|
Sign up to set email alerts
|

Optimization of Action Recognition Model Based on Multi-Task Learning and Boundary Gradient

Abstract: Recently, people’s demand for action recognition has extended from the initial high classification accuracy to the high accuracy of the temporal action detection. It is challenging to meet the two requirements simultaneously. The key to behavior recognition lies in the quantity and quality of the extracted features. In this paper, a two-stream convolutional network is used. A three-dimensional convolutional neural network (3D-CNN) is used to extract spatiotemporal features from the consecutive frames. A two-di… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1

Citation Types

0
3
0

Year Published

2021
2021
2024
2024

Publication Types

Select...
4
1

Relationship

0
5

Authors

Journals

citations
Cited by 5 publications
(4 citation statements)
references
References 26 publications
0
3
0
Order By: Relevance
“…They applied a unique-match restriction among the descriptors and eliminated matches that were too far apart from one another to assure correctness. For picture classification, dense sampling has proven to perform better than sparse interest spots (4) . Similarly, dense sampling at predictable geographical and temporal places fared better than cutting-edge space-time interest point detectors in recent assessments of action identification.…”
Section: Methodsmentioning
confidence: 99%
“…They applied a unique-match restriction among the descriptors and eliminated matches that were too far apart from one another to assure correctness. For picture classification, dense sampling has proven to perform better than sparse interest spots (4) . Similarly, dense sampling at predictable geographical and temporal places fared better than cutting-edge space-time interest point detectors in recent assessments of action identification.…”
Section: Methodsmentioning
confidence: 99%
“…[16] uses human boxes and key points to represent instance-level features, and the action region features of this framework are used as the input of the temporal action head network, which makes the framework more discriminative. The author of [17] proposed a multi-scale feature extraction method used to extract richer feature information. At the same time, a multi-task learning model is introduced.…”
Section: Action Recognitionmentioning
confidence: 99%
“…Convolutional neural networks (CNNs) are nowadays the state-of-the-art methods for a wide range of computer vision tasks, thanks to the large-scale public datasets [1][2][3] and high performance accelerators like graphical processing units (GPUs). For example, CNNs rank the highest in benchmarks in image classification [4][5][6], object detection [7][8][9][10], semantic segmentation [11], and action recognitions [12][13][14]. The general recipe for a successful CNN model includes training a large-sized model with a large-scale dataset.…”
Section: Introductionmentioning
confidence: 99%
“…Collecting a large-scale dataset is also very expensive. To solve these issues, previous works have proposed multi-task learning (MTL) [12,[15][16][17]. By definition, multi-task learning trains a model with multiple functionalities.…”
Section: Introductionmentioning
confidence: 99%