People often do other things while walking; for example, many walk while looking at their smartphones. We also walk differently than usual in some conditions; on ice or snow, for instance, we tend to waddle. Recognizing such walking patterns could provide users with contextual information tailored to the current situation. To formulate this as a machine-learning problem, we defined 18 different everyday walking styles. Noting that walking strategies significantly affect the spatiotemporal features of hand motions, e.g., the speed and intensity of the arm swing, we propose a smartwatch-based wearable system that can recognize these predefined walking styles. We developed a wearable system, suitable for use with a commercial smartwatch, that captures hand motions in the form of multivariate time-series (MTS) signals. We then employed a set of machine-learning algorithms, including both feature-based methods and recent deep learning models, to classify the MTS data in a supervised fashion. Experimental results demonstrated that, with recent deep learning models, the proposed approach successfully recognized a variety of walking patterns from the smartwatch measurements. We further analyzed the results with attention-based recurrent neural networks to understand the relative contributions of the MTS signals to the classification.
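The abstract does not give implementation details, but the pipeline it describes (supervised classification of smartwatch MTS windows with an attention-based recurrent network) might look like the following minimal PyTorch sketch. The channel count (six: 3-axis accelerometer plus 3-axis gyroscope), window length, and architecture are assumptions for illustration; only the 18 classes come from the text. Here the attention weights score time steps, one common way to inspect where a model focuses; analyzing per-signal contributions, as the authors describe, could use channel-wise attention analogously.

```python
# Minimal sketch (not the authors' exact model): an attention-based RNN
# classifier for multivariate time-series (MTS) windows from a smartwatch.
# Channel count (6: accelerometer + gyroscope) and window length are
# assumptions; the 18 classes follow the abstract.
import torch
import torch.nn as nn

class AttnRNNClassifier(nn.Module):
    def __init__(self, n_channels=6, hidden=64, n_classes=18):
        super().__init__()
        self.rnn = nn.LSTM(n_channels, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)        # per-time-step attention score
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, channels)
        h, _ = self.rnn(x)                      # h: (batch, time, hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        ctx = (w * h).sum(dim=1)                # attention-weighted context
        return self.head(ctx), w.squeeze(-1)    # logits + weights for analysis

# Example: classify a batch of 2-second windows sampled at 50 Hz (assumed rate).
model = AttnRNNClassifier()
x = torch.randn(8, 100, 6)                      # 8 windows, 100 steps, 6 channels
logits, attn = model(x)
print(logits.shape, attn.shape)                 # (8, 18) and (8, 100)
```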
Previous research on 3D skeleton-based human action recognition has frequently relied on a sequence-wise viewpoint normalization process, which adjusts the view directions of all segmented action sequences. This type of approach is typically robust to the viewpoint variations found in short video clips, which are common in public datasets. However, our preliminary investigation of complex action sequences, such as discussions or smoking, reveals its limitations in capturing the intricacies of such actions. To address these view-dependency issues, we propose a straightforward, yet effective, sequence-wise augmentation technique: rotating human key points around either the z-axis or the spine vector to create variations in viewing direction. This strategy enhances the robustness of action recognition models against viewpoint changes, which occur mainly within the horizontal plane (azimuth). We scrutinize the robustness of this approach against real-world viewpoint variations through extensive empirical studies on multiple public datasets and an additional set of custom action sequences. Despite the simplicity of our approach, our experimental results consistently show improved action recognition accuracies. Compared to the sequence-wise viewpoint normalization method used with advanced deep learning models such as Conv1D, LSTM, and Transformer, our approach achieved a relative increase in accuracy of 34.42% for z-axis rotation and 10.86% for spine-vector rotation.
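As a concrete illustration of the augmentation just described, the sketch below applies one random rotation to every frame of a key-point sequence, about either the vertical z-axis or the pelvis-to-neck (spine) vector. The joint indices, angle range, and rotation center are assumptions for illustration, not the authors' exact settings; only the choice of rotation axes comes from the abstract.

```python
# Minimal sketch of sequence-wise rotation augmentation for 3D key points.
# The same rotation is applied to every frame of a sequence, so the viewing
# direction changes while the motion itself is preserved. Joint layout
# (pelvis at index 0, neck at index 1) is an assumed convention.
import numpy as np

def rotation_matrix(axis, theta):
    """Rodrigues' formula: rotation by theta (radians) about a unit axis."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def augment_sequence(seq, mode="z", pelvis=0, neck=1, rng=np.random):
    """seq: (frames, joints, 3) array of 3D key points."""
    theta = rng.uniform(-np.pi, np.pi)          # random azimuth offset (assumed range)
    if mode == "z":
        axis = np.array([0.0, 0.0, 1.0])        # vertical (z) axis
    else:                                       # "spine": pelvis -> neck vector
        axis = seq[0, neck] - seq[0, pelvis]    # taken from the first frame
    center = seq[0, pelvis]                     # rotate about the pelvis
    R = rotation_matrix(axis, theta)
    return (seq - center) @ R.T + center

# Example: augment a 120-frame, 25-joint sequence.
seq = np.random.randn(120, 25, 3)
aug = augment_sequence(seq, mode="spine")
print(aug.shape)                                # (120, 25, 3)
```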