Sign language is an important way for deaf people to understand and communicate with others. Many researchers use Wi-Fi signals to recognize hand and finger gestures non-invasively. However, Wi-Fi signals usually contain interference, background noise, and mixed multipath noise. In this study, Wi-Fi Channel State Information (CSI) is preprocessed by singular value decomposition (SVD) to obtain the essential signals. Sign language involves both the positional relationships of gestures in space and the changes of actions over time. We propose a novel dual-output two-stream convolutional neural network (CNN). It not only combines a spatial-stream network and a motion-stream network, but also effectively alleviates the backpropagation problem of the two-stream CNN and improves its recognition accuracy. After the two streams are fused, an attention mechanism is applied to select the important features they have learned. Our method was validated on the public SignFi dataset using five-fold cross-validation. Experimental results show that SVD preprocessing improves the performance of our dual-output two-stream network. For the home, lab, and lab + home environments, the average recognition accuracies are 99.13%, 96.79%, and 97.08%, respectively. Compared with other methods, our method achieves good performance and better generalization capability.
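The SVD preprocessing step mentioned above can be illustrated with a low-rank reconstruction. This is only a minimal sketch, not the authors' pipeline: the abstract does not state how the CSI matrix is arranged or how many singular components are kept, so the `(time samples x subcarriers)` layout, the synthetic data, the function name `svd_denoise`, and the `rank` parameter are all assumptions made for illustration.

```python
import numpy as np


def svd_denoise(csi, rank):
    """Return the best rank-`rank` approximation of a 2-D matrix.

    Assumption: `csi` holds real-valued CSI amplitudes arranged as
    (time samples, subcarriers). Keeping only the largest singular
    values discards low-energy components, which are often noise.
    """
    u, s, vt = np.linalg.svd(csi, full_matrices=False)
    # Scale the kept left singular vectors by their singular values,
    # then project back with the kept right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Synthetic stand-in for CSI: a rank-1 "gesture" pattern plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
clean = np.outer(np.sin(2 * np.pi * 3 * t), np.ones(30))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

denoised = svd_denoise(noisy, rank=1)

err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(denoised - clean)
print(f"error before: {err_before:.3f}, after: {err_after:.3f}")
```

On real CSI the right number of components is not known in advance; a common heuristic is to keep enough singular values to cover a chosen fraction of the total energy, but whether the paper does this is not stated.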
Wi-Fi sensing for gesture recognition is a fascinating and challenging research topic. We propose a multitask sign language recognition framework called Wi-SignFi, which accounts for real-world gestures associated with various objects, actions, or scenes. The proposed framework comprises a convolutional neural network (CNN) and a K-nearest neighbor (KNN) module. It is evaluated on the public SignFi dataset and achieves average gesture recognition accuracies of 98.91%, 86.67%, and 99.99% on 276/150 activities, five users, and two environments, respectively. The experimental results show that the proposed gesture recognition method outperforms previous methods. Instead of converting the channel state information (CSI) data of multiple antennas into three-dimensional matrices (i.e., color images) as in the existing literature, we found that the CSI data can be converted into two-dimensional matrices (i.e., grayscale images) by concatenating different channels, allowing the Wi-SignFi model to balance speed and accuracy. This finding facilitates deploying Wi-SignFi on Nvidia’s Jetson Nano edge embedded devices. We expect this work to promote the integration of Wi-Fi sensing and the Internet of Things (IoT) and improve the quality of life of the deaf community.
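The conversion from multi-antenna CSI to a single grayscale image can be sketched as follows. This is an illustrative guess at the idea, not the Wi-SignFi implementation: the abstract only says that different channels are concatenated, so the input shape `(time samples, subcarriers, antennas)`, the choice to concatenate along the subcarrier axis, the min-max scaling, and the function name `csi_to_grayscale` are all assumptions.

```python
import numpy as np


def csi_to_grayscale(csi):
    """Concatenate antenna channels side by side into one 2-D image.

    Assumption: `csi` has shape (time samples, subcarriers, antennas).
    The result has shape (time samples, subcarriers * antennas) and is
    min-max scaled to [0, 1], like a single-channel grayscale image.
    """
    n_antennas = csi.shape[2]
    # Place each antenna's (time, subcarrier) slice next to the previous one.
    flat = np.concatenate([csi[:, :, a] for a in range(n_antennas)], axis=1)
    lo, hi = flat.min(), flat.max()
    if hi == lo:
        # Constant input: avoid division by zero.
        return np.zeros(flat.shape, dtype=float)
    return (flat - lo) / (hi - lo)


# Hypothetical sample: 200 time samples, 30 subcarriers, 3 antennas.
rng = np.random.default_rng(1)
sample = rng.standard_normal((200, 30, 3))

image = csi_to_grayscale(sample)
print(image.shape)  # (200, 90)
```

A single-channel input of this kind lets the first convolutional layer use one input channel instead of three, which is one plausible reason for the speed benefit the abstract reports.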