2019
DOI: 10.1007/978-981-13-6834-9_12

Research on Human Behavior Recognition Based on Convolutional Neural Network

Cited by 3 publications (3 citation statements)
References 7 publications
“…At the same time, in order to help the system obtain occlusion target information faster, channel attention should be introduced before each step of pooling operation, to provide convolution information for all pixels, enhance the expression ability of useful information, and use multilevel channel attention to achieve feature fusion, so as to provide effective basis for global semantic information and local detail information, and finally obtain better segmentation results. [10][11][12][13] Based on the analysis of the network structure design shown in Table 1 below, when the RGB three-channel image with 256×256 pixels is input for downsampling operation, the downsampling operation is repeated according to the above rules after two layers of convolution operation with the convolution kernel size of 3×3 and step size of 1, batch normalization layer, attention module and maximum pooling layer, etc. Gradually increase the size of the feature map and reduce the depth of the feature layer, use the attention features contained in the downsampling and the same scale information of the up-sampling to achieve feature stitching, and repeat the up-sampling operation in accordance with the basic provisions, and finally get the 256×256×64 feature map.…”
Section: Remote Sensing Image Segmentation Model Based On Multi-level... (mentioning)
confidence: 99%
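
The downsampling stage described in the statement above (two 3×3 convolutions with stride 1, an attention module, then max pooling) can be illustrated with a rough sketch. This is not the citing paper's implementation: the padding of 1, the 2×2 pooling window, the four stages, and the weight-free sigmoid gate used for channel attention are all assumptions made for illustration, since Table 1 of that paper is not reproduced here. Batch normalization is omitted because it does not change tensor shapes. The function names are hypothetical.

```python
import math


def layer_out_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def channel_attention(feature_map):
    """Rescale each channel by a sigmoid gate computed from its global average.

    feature_map is a list of channels, each a list of rows of floats.
    Assumption: no learned weights, unlike a trained attention module.
    """
    gated = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)        # squeeze: global average pooling
        gate = 1.0 / (1.0 + math.exp(-mean))    # gate strictly between 0 and 1
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def downsample_stage(size):
    """Two 3x3 stride-1 convolutions (padding 1 assumed), then 2x2 max pooling."""
    size = layer_out_size(size)  # first convolution keeps the spatial size
    size = layer_out_size(size)  # second convolution keeps the spatial size
    # Channel attention would be applied here; it does not change the shape.
    return layer_out_size(size, kernel=2, stride=2, padding=0)


def trace_sizes(input_size=256, stages=4):
    """Spatial size after each downsampling stage (stage count is assumed)."""
    sizes = [input_size]
    for _ in range(stages):
        sizes.append(downsample_stage(sizes[-1]))
    return sizes


if __name__ == "__main__":
    print(trace_sizes())
```

Under these assumptions each stage halves the spatial size of the 256×256 input, and a decoder that doubles the size the same number of times would return to 256×256, matching the output resolution quoted in the statement.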
“…The neural network model used in this experiment is a multi-segmental Two-Stream Convolutional neural network model [14]. In the design process, the Convolutional Two-Stream network model proposed in the literature is referred to and the content of this study is adjusted.…”
Section: Detailed Design and Code Implementation Of Video Behavior Recognition Model (mentioning)
confidence: 99%
“…Compared to image information, skeleton information is more suitable for classroom research as it has no background or light interference. Previously, many researchers [1,2] used skeleton points to identify students' behavior and analyze classroom participation, but all remained in the single image stage. Recently, many studies [3,4] use streaming data as the analysis object and focus on mining the spatiotemporal features in the data, and the proposed methods have effectively utilized the temporal information.…”
Section: Introduction (mentioning)
confidence: 99%