In this paper, we propose a human action recognition method using HOIRM (histogram of oriented interest region motion) feature fusion and a BOW (bag of words) model based on AP (affinity propagation) clustering. First, a HOIRM feature extraction method based on the ROI of spatiotemporal interest points is proposed. HOIRM can be regarded as a mid-level feature between local and global features. Then, HOIRM is fused with 3D HOG and 3D HOF local features using a cumulative histogram. This fusion further improves the robustness of local features to variations in camera viewing angle and distance in complex scenes, which in turn improves action recognition accuracy. Finally, a BOW model based on AP clustering is proposed and applied to action classification. It determines an appropriate visual dictionary size and achieves a better clustering effect for the joint description of multiple features. The experimental results demonstrate that, using the fused features with the proposed BOW model, the average recognition rate is 95.75% on the KTH database and 88.25% on the UCF database, both higher than those obtained using only 3D HOG+3D HOF or HOIRM features. Moreover, the average recognition rate achieved by the proposed method on the two databases is higher than that obtained by other methods.
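A practical appeal of AP clustering for BOW dictionaries is that, unlike k-means, it does not require the dictionary size to be fixed in advance; the exemplars (visual words) emerge from the data. The following is a minimal sketch of that idea using scikit-learn's AffinityPropagation, not the paper's implementation; the descriptor arrays and the `build_vocabulary`/`bow_histogram` helpers are illustrative assumptions.

```python
# Minimal sketch: building a BOW visual dictionary with affinity propagation.
# Assumes local descriptors (e.g. fused 3D HOG + 3D HOF + HOIRM vectors) are
# stacked row-wise into an N x D array; not the authors' actual code.
import numpy as np
from sklearn.cluster import AffinityPropagation

def build_vocabulary(descriptors):
    """Cluster training descriptors into visual words; AP picks the count."""
    ap = AffinityPropagation(damping=0.9, random_state=0)
    ap.fit(descriptors)
    return ap  # ap.cluster_centers_ holds the exemplar descriptors

def bow_histogram(ap, video_descriptors):
    """Quantize one video's descriptors into a normalized word histogram."""
    words = ap.predict(video_descriptors)
    hist = np.bincount(words, minlength=len(ap.cluster_centers_)).astype(float)
    return hist / max(hist.sum(), 1.0)

# Hypothetical usage:
# ap = build_vocabulary(np.vstack(all_training_descriptors))
# feature = bow_histogram(ap, one_video_descriptors)  # input to a classifier
```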
Judging the maturity level of each hand-wrist reference bone is the core issue in bone age assessment. Relying on the superiority of convolutional neural networks in feature representation, deep learning has been widely studied for automatic bone age assessment. However, an efficient yet complex deep learning network requires a large dataset with bone-maturity-level labels for training, restricting its large-scale application in bone maturity classification. For this reason, we transform the bone-maturity-level classification problem into a similarity matching problem. We also propose a general structure based on the Siamese network, merging two inputs into a two-channel input and introducing a dual attention mechanism, to create an Attentional Two-Channel Network (ATC-Net). This paper takes the intermediate phalanx III as an example to assess the performance of the similarity matching method and the ATC-Net. Experiments show that our method performs better on small datasets, effectively compensating for the data shortage problem. The ATC-Net used for classification significantly reduces the evaluation time compared with other classical networks: it reduces the time to assess one sample by about 49% compared with VGG-16 and, more importantly, achieves the highest classification accuracy, 92.74%, among all investigated networks.
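The core structural idea is that, instead of running two Siamese branches with shared weights, the reference and query images are stacked as the two channels of a single input. Below is a minimal PyTorch sketch of that two-channel matching arrangement; the layer sizes are illustrative, and the squeeze-and-excitation-style channel attention stands in for the paper's dual attention mechanism, whose details are not reproduced here.

```python
# Minimal sketch of the two-channel matching idea, not the actual ATC-Net.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention; a stand-in for the paper's dual attention."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))      # global average pool -> weights
        return x * w[:, :, None, None]       # reweight feature channels

class TwoChannelNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            ChannelAttention(64))
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1))                # similarity score (logit)

    def forward(self, img_a, img_b):
        x = torch.cat([img_a, img_b], dim=1)  # merge pair into 2 channels
        return self.head(self.features(x))

# Usage: a pair of single-channel bone images -> one similarity logit.
# net = TwoChannelNet()
# score = net(torch.randn(4, 1, 128, 128), torch.randn(4, 1, 128, 128))
```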
Synthetic aperture radar (SAR) multi-target interactive motion recognition classifies the type of interactive motion and generates descriptions of the interactive motions at the semantic level by considering the relevance of multi-target motions. A method for SAR multi-target interactive motion recognition is proposed, which includes moving target detection, target type recognition, interactive motion feature extraction, and multi-target interactive motion type recognition. Wavelet thresholding denoising combined with a convolutional neural network (CNN) is proposed for target type recognition: the method performs wavelet thresholding denoising on SAR target images and then uses an eight-layer CNN named EilNet to achieve target recognition. After target type recognition, a multi-target interactive motion type recognition method is proposed, in which a motion feature matrix is constructed and a four-layer CNN named FolNet performs interactive motion type recognition. A motion simulation dataset based on the MSTAR dataset is built, which includes four kinds of interactive motions performed by two moving targets. The experimental results show that both the Wavelet + EilNet method for target type recognition and FolNet for multi-target interactive motion type recognition outperform other methods. Thus, the proposed method is an effective method for SAR multi-target interactive motion recognition.
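The denoising stage is a standard wavelet-thresholding pipeline: decompose the image, shrink the detail coefficients, and reconstruct. Below is a minimal PyWavelets sketch of that preprocessing step, assuming soft thresholding with the universal (VisuShrink) threshold; the paper's exact wavelet basis and threshold rule are not specified here, and `db4`/`level=2` are illustrative choices.

```python
# Minimal sketch: wavelet-threshold denoising of a SAR image chip before
# feeding it to a recognition CNN. Assumptions: soft thresholding, universal
# threshold, noise sigma estimated from the finest diagonal detail band.
import numpy as np
import pywt

def wavelet_denoise(img, wavelet="db4", level=2):
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # Robust noise estimate from the finest diagonal detail coefficients.
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(img.size))
    denoised = [coeffs[0]]                    # keep the approximation band
    for detail in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, thresh, mode="soft")
                              for d in detail))
    return pywt.waverec2(denoised, wavelet)

# chip = ...                        # one MSTAR-style SAR image chip (2D array)
# clean = wavelet_denoise(chip)     # then pass to the target-recognition CNN
```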