2022
DOI: 10.1109/access.2022.3140373
|View full text |Cite
|
Sign up to set email alerts
|

ConvAE-LSTM: Convolutional Autoencoder Long Short-Term Memory Network for Smartphone-Based Human Activity Recognition

Abstract: The self-regulated recognition of human activities from time-series smartphone sensor data is a growing research area in smart and intelligent health care. Deep learning (DL) approaches have exhibited improvements over traditional machine learning (ML) models in various domains, including human activity recognition (HAR). Several issues are involved with traditional ML approaches; these include handcrafted feature extraction, which is a tedious and complex task involving expert domain knowledge, and the use of… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1
1

Citation Types

2
27
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
5
2
1

Relationship

0
8

Authors

Journals

citations
Cited by 49 publications
(29 citation statements)
references
References 95 publications
2
27
0
Order By: Relevance
“…This further demonstrates the dominance of deep learning models over classical algorithms that rely heavily on heuristic hand-crafted features [11]. Interestingly, it also justified the applicability of the CNN-LSTM hybrid model to achieve state-of-the-art results with time series data, more importantly with activity context recognition [14,51,53,61,70] and speech recognition problems [22].…”
Section: Discussionmentioning
confidence: 72%
See 1 more Smart Citation
“…This further demonstrates the dominance of deep learning models over classical algorithms that rely heavily on heuristic hand-crafted features [11]. Interestingly, it also justified the applicability of the CNN-LSTM hybrid model to achieve state-of-the-art results with time series data, more importantly with activity context recognition [14,51,53,61,70] and speech recognition problems [22].…”
Section: Discussionmentioning
confidence: 72%
“…For both of the experiments, we used the following metrics: Precision, Recall, F-Score and the Confusion matrix. According to many authors, these are the most commonly used metrics in the field of activity recognition [10,[69][70]. As clearly illustrated in Figure 10, the dataset suffers from class imbalance.…”
Section: Evaluation Metricsmentioning
confidence: 99%
“…Additionally, CNN and LSTM are widely adopted with high accuracy rate activity recognition among the applied networks. CNN is commonly separated into numerous learning stages, each of which consists of a mix of convolutional operation and nonlinear processing units, as follows: where reveals the latent representation of the -th feature map of the current layer, is the activation function, denotes the convolution operation, indicates the -th feature map of the group of the feature maps achieved from the upper layer, and express the weights matrix and the bias of the -th feature map of the current layer, receptively [ 52 ]. In our model, the rectified linear units (ReLU) were employed as the activation functions to subsequently conduct the non-linear transformation to obtain the feature maps, denoted by: …”
Section: Methodsmentioning
confidence: 99%
“…CNN considers each frame of sensor data as independent and extracts the features for these isolated portions of data without considering the temporal contexts beyond the boundaries of the frame. Due to the continuity of sensor data flow produced by the user’s behavior, local spatial correlations and temporally long-term connections are both important to identify the landmark [ 52 ]. LSTMs with learnable gates, which modulate the flow of information and control when to forget previous hidden states, as variants of vanilla recurrent neural networks (RNNs), allow the neural network to effectively extract the long-range dependencies of time-series sensor data [ 54 ].…”
Section: Methodsmentioning
confidence: 99%
“…The proposed model was a sequential fusion of CNNs, LSTMs, and fully connected layers. Thakur et al 21 proposed a DL-based unified model composed of CNNs, autoencoders, and LSTMs. The model learnt both spatial features and temporal features from smartphone sensor data.…”
Section: Introductionmentioning
confidence: 99%