IEEE Signal Processing Letters, Oct. 2007

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.

Abstract-This paper describes a robust voice activity detector using an ultrasonic Doppler sonar device. An ultrasonic beam is incident on the talker's face. Facial movements result in Doppler frequency shifts in the reflected signal, which are sensed by an ultrasonic sensor. Speech-related facial movements result in identifiable patterns in the spectrum of the received signal, which can be used to identify speech activity. These sensors are not affected by even high levels of ambient audio noise. Unlike most other non-acoustic sensors, the device need not be taped to the talker. A simple yet robust method of extracting the voice activity information from the ultrasonic Doppler signal is developed and presented in this paper. The algorithm is seen to be very effective and robust to noise, and can be implemented in real time.
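The Doppler mechanism summarized in the abstract can be illustrated with a small sketch. It is not the algorithm developed in the paper, which the abstract does not specify. It only shows how a moving reflector moves energy from the carrier into nearby frequencies, and how a per-frame energy ratio could flag motion. The 40 kHz carrier, 96 kHz sampling rate, 20 ms frame, echo gain, threshold, and all function names are assumptions chosen for illustration. The shift uses the standard two-way approximation Δf ≈ 2·v·f_c/c for reflector speeds far below the speed of sound.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
SPEED_OF_SOUND_M_S = 343.0   # air at roughly 20 degrees C
CARRIER_HZ = 40_000.0        # assumed ultrasonic carrier
FS_HZ = 96_000.0             # assumed sampling rate
FRAME_LEN = 1920             # 20 ms frame, 50 Hz bin spacing


def doppler_shift_hz(carrier_hz, velocity_m_s, c=SPEED_OF_SOUND_M_S):
    """Two-way Doppler shift for a reflector moving toward the sensor (v << c)."""
    return 2.0 * velocity_m_s * carrier_hz / c


def goertzel_power(samples, freq_hz, fs_hz):
    """Squared DFT magnitude of `samples` at `freq_hz` (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def synth_frame(reflector_velocity_m_s, n=FRAME_LEN, echo_gain=0.3):
    """Toy received frame: unshifted carrier plus one Doppler-shifted echo."""
    shifted_hz = CARRIER_HZ + doppler_shift_hz(CARRIER_HZ, reflector_velocity_m_s)
    return [
        math.sin(2.0 * math.pi * CARRIER_HZ * i / FS_HZ)
        + echo_gain * math.sin(2.0 * math.pi * shifted_hz * i / FS_HZ)
        for i in range(n)
    ]


def sideband_ratio(frame):
    """Energy in the four bins next to the carrier, relative to the carrier bin."""
    bin_hz = FS_HZ / len(frame)
    carrier = goertzel_power(frame, CARRIER_HZ, FS_HZ)
    side = sum(
        goertzel_power(frame, CARRIER_HZ + k * bin_hz, FS_HZ)
        for k in (-2, -1, 1, 2)
    )
    return side / carrier


def frame_has_motion(frame, threshold=1e-3):
    """Flag a frame whose sideband energy exceeds an (arbitrary) threshold."""
    return sideband_ratio(frame) > threshold


if __name__ == "__main__":
    print("shift at 0.1 m/s:", doppler_shift_hz(CARRIER_HZ, 0.1), "Hz")
    print("still face :", frame_has_motion(synth_frame(0.0)))
    print("moving face:", frame_has_motion(synth_frame(0.1)))
```

This toy detects any motion in the beam, not speech specifically. Telling speech-related facial movement apart from other movement would need the spectral patterns the abstract refers to. Because the measurement is taken around the ultrasonic carrier, audio-band noise does not enter the ratio, which is consistent with the noise robustness claimed above.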