Optical-flow-based human action recognition is difficult to apply in practice because of its heavy computational cost. To address this, a human action recognition model, I3D-shufflenet, is proposed that combines the advantages of the I3D neural network and the lightweight shufflenet model. The 5 × 5 convolution kernel of I3D is replaced by two stacked 3 × 3 convolution kernels, which reduces the amount of computation. A shuffle layer is adopted to exchange features between channels. Human actions are then recognized and classified with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the utilization of useful information. Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the representation of human actions and reduces the cost of feature extraction. I3D-shufflenet is tested on the UCF101 dataset and compared with other models. The final results show that I3D-shufflenet reaches an accuracy of 96.4%, higher than the original I3D.
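The two ideas in this abstract, replacing a 5 × 5 kernel with two 3 × 3 kernels and shuffling channels between groups, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes 2D spatial kernels, equal input and output channel counts, and no bias terms. The names `conv_weights` and `channel_shuffle` are hypothetical, and the shuffle follows the usual reshape-transpose-flatten form of grouped channel shuffling.

```python
def conv_weights(kernel, c_in, c_out):
    """Weight count of one kernel x kernel convolution layer (bias ignored)."""
    return kernel * kernel * c_in * c_out


def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to viewing the channel list as (groups, n), transposing to
    (n, groups), and flattening. `channels` is any list whose length is
    divisible by `groups`.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


# Assumed channel width, for illustration only.
c = 64
single_5x5 = conv_weights(5, c, c)
double_3x3 = 2 * conv_weights(3, c, c)
print(single_5x5, double_3x3)  # 25 vs 18 weights per channel pair

print(channel_shuffle(list(range(6)), groups=2))
```

With equal channel widths, the two stacked 3 × 3 layers use 18 weights per input-output channel pair against 25 for a single 5 × 5 layer while covering the same 5 × 5 receptive field.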
Recently, underwater information analysis technology has developed rapidly, benefiting underwater resource exploration, underwater aquaculture, and related fields. Deep learning-based computer vision is replacing dangerous and laborious manual work and has gradually become the mainstream approach. A visual analysis method based on binocular cameras can not only collect seabed images but also reconstruct 3D scene information. In this work, the parallax of the binocular image is used to calculate the depth of the underwater object, and a binocular camera-based refined analysis method for estimating the body length of underwater creatures is constructed. A fully convolutional network (FCN) segments the underwater object in the image to obtain its position. A fish body direction estimation algorithm is proposed based on the segmented image. The semi-global block matching (SGBM) algorithm is used to calculate the depth of the object region, and the body length is estimated from the left and right views of the object. By combining FCN and SGBM, the algorithm offers advantages in both time and accuracy when analyzing objects of interest. Experimental results show that, compared with the original SGBM algorithm, the method effectively reduces unnecessary information and improves efficiency and accuracy.
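The step from parallax to depth and then to body length can be sketched with standard rectified-stereo geometry. This is an illustration under stated assumptions, not the paper's algorithm: the cameras are assumed rectified with a pinhole model, the focal length is given in pixels and the baseline in metres, and the head and tail pixel positions are assumed to come from the segmentation and direction-estimation steps. All function and parameter names are hypothetical.

```python
import math


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def back_project(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Map a left-image pixel (u, v) with known disparity to a 3D point.

    (cx, cy) is the principal point in pixels; the result is in metres.
    """
    z = depth_from_disparity(disparity_px, focal_px, baseline_m)
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def body_length(head_xyz, tail_xyz):
    """Euclidean distance between two 3D endpoints."""
    return math.dist(head_xyz, tail_xyz)


# Illustrative numbers only (not from the paper).
f, b, cx, cy = 1000.0, 0.1, 640.0, 360.0
head = back_project(640.0, 360.0, 50.0, f, b, cx, cy)
tail = back_project(740.0, 360.0, 50.0, f, b, cx, cy)
print(body_length(head, tail))
```

Restricting the disparity computation to the segmented object region, as the abstract describes, means only the pixels needed for the two endpoints have to be matched.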
A divergent nickel(0)-catalyzed hydrosilylation/cyclization of 1,6-enynes has been developed, providing an efficient synthetic route to vinyl silanes or alkyl silanes from the same starting materials.
In practical scenes, object detection faces very complicated conditions. Occlusion occurs frequently in real scenes and can reduce detection accuracy, especially for the occluded objects themselves. For deep models, a larger dataset with sufficient occlusion samples improves detection performance. However, samples with occlusion are hard to obtain. Therefore, a global average pooling (GAP) based adversarial Faster-RCNN is proposed to generate hard samples and enhance the performance of the object detection algorithm. With this model, sufficient hard samples can be generated, so the detector can be trained adequately on occluded objects. Hard samples are generated in image feature space rather than by generating images directly. The class-dependent part is obtained by the GAP network and is then obscured to produce the feature map of a hard sample for reinforcement training of the model. As a result, a better object detection model can be trained using a conventional dataset. Faster-RCNN is adopted as the baseline, and Faster-RCNN and the GAP network are trained jointly to improve the performance of the proposed model. The simulation results validate the proposed algorithm.
INDEX TERMS Faster-RCNN, object detection, occlusion problem-oriented, global average pooling.
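Obscuring the class-dependent part of a feature map can be sketched as follows. This is only an illustration, not the paper's method: it assumes the class-dependent region is located with a class-activation-map style weighting (the per-channel classifier weights that follow a GAP layer), and that "obscuring" means zeroing the highest-scoring spatial positions. The function names and the `drop_fraction` parameter are hypothetical.

```python
def class_activation_map(features, class_weights):
    """Weighted sum over channels: cam[i][j] = sum_c w_c * features[c][i][j].

    `features` is a nested list shaped [C][H][W]; `class_weights` has length C.
    """
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            sum(w * fmap[i][j] for w, fmap in zip(class_weights, features))
            for j in range(width)
        ]
        for i in range(height)
    ]


def occlude_top_region(features, class_weights, drop_fraction=0.25):
    """Zero the spatial positions with the highest class activation.

    Returns a new [C][H][W] feature map; ties at the threshold are all
    dropped, so slightly more than `drop_fraction` may be removed.
    """
    cam = class_activation_map(features, class_weights)
    scores = sorted((v for row in cam for v in row), reverse=True)
    k = max(1, int(len(scores) * drop_fraction))
    threshold = scores[k - 1]
    return [
        [
            [0.0 if cam[i][j] >= threshold else value
             for j, value in enumerate(row)]
            for i, row in enumerate(fmap)
        ]
        for fmap in features
    ]


# Toy 2-channel, 2x2 feature map (illustrative values only).
features = [
    [[1.0, 0.5], [0.5, 3.0]],
    [[0.0, 0.5], [0.5, 2.0]],
]
hard_sample = occlude_top_region(features, class_weights=[1.0, 1.0])
print(hard_sample)
```

Because the masking acts on feature maps rather than pixels, no occluded image has to be synthesized, which matches the abstract's point that hard samples are generated in feature space.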