The possibility of speech processing in the absence of an intelligible acoustic signal has given rise to the idea of a 'silent speech' interface, to be used as an aid for the speech-handicapped, or as part of a communications system operating in silence-required or high-background-noise environments. The article first outlines the emergence of the silent speech interface from the fields of speech production, automatic speech processing, speech pathology research, and telecommunications privacy issues, and then presents demonstrator systems based on seven different types of technologies. A concluding section underlines some of the common challenges faced by silent speech interface researchers and offers ideas for possible future directions.
Wireless sensor networks (WSNs) offer an attractive solution to many environmental, security, and process-monitoring problems. However, one barrier to their wider adoption is the need to supply electrical power over extended periods without dedicated wiring. Energy harvesting offers a potential solution to this problem in many applications. This paper reviews the characteristics and energy requirements of typical sensor network nodes, assesses a range of potential ambient energy sources, and outlines the characteristics of a wide range of energy conversion devices. It then proposes a method for comparing these diverse sources and conversion mechanisms in terms of their normalised power density.
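The comparison idea the abstract describes can be sketched briefly: each candidate source is reduced to a common figure of merit, harvested power normalised by device volume. The sketch below is a minimal illustration of that calculation; the source names and numeric values are placeholders, not figures from the paper.

```python
# Minimal sketch: compare energy-harvesting options by normalised power
# density (power per unit device volume). All values below are illustrative
# placeholders, not data from the reviewed paper.
SOURCES = {
    # name: (harvested power in microwatts, device volume in cm^3)
    "solar (indoor)": (10.0, 1.0),
    "vibration":      (100.0, 4.0),
    "thermal":        (60.0, 2.0),
}

for name, (power_uw, volume_cm3) in SOURCES.items():
    density = power_uw / volume_cm3  # normalised power density, uW/cm^3
    print(f"{name}: {density:.1f} uW/cm^3")
```

Expressing every source in the same units is what makes otherwise incomparable mechanisms (photovoltaic, vibrational, thermoelectric) rankable against a node's power budget.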
Surgical voice restoration after laryngectomy has a number of limitations and drawbacks. The current gold standard uses a tracheo-oesophageal fistula (TOF) valve to divert air from the lungs into the throat, where it sets tissue vibrating so that speech can be formed. Not all patients can use these valves, and those who do are susceptible to complications associated with valve failure, so there is still a place for other voice restoration options. With advances in electronic miniaturisation and portable computing power, a computing-intensive solution has been investigated. Magnets were placed on the lips, teeth, and tongue of a volunteer, causing a change in the surrounding magnetic field when the individual mouthed words. These changes were detected by six dual-axis magnetic sensors incorporated into a pair of special glasses. The resulting signals were compared with previously recorded training data using a dynamic time warping algorithm based on dynamic programming. Against a small-vocabulary database, the patterns were recognised with an accuracy of 97% for words and 94% for phonemes. On this basis we plan to develop a speech system for patients who have lost laryngeal function.
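The matching step described here, dynamic time warping (DTW) computed by dynamic programming, is a standard algorithm and can be sketched compactly. In this minimal Python sketch the Euclidean frame distance, the (frames x channels) input layout, and the helper names are assumptions for illustration, not the study's actual implementation.

```python
# Sketch of dynamic time warping (DTW) between a mouthed-word sensor sequence
# and a stored template, computed by dynamic programming. Frame distance and
# data layout are illustrative assumptions.
import numpy as np

def dtw_distance(query: np.ndarray, template: np.ndarray) -> float:
    """Return the DTW alignment cost between two (frames x channels) arrays."""
    n, m = len(query), len(template)
    # cost[i][j] = best cumulative cost aligning query[:i] with template[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(query[i - 1] - template[j - 1])  # frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def recognise(query: np.ndarray, templates: dict) -> str:
    """Pick the vocabulary word whose template aligns most cheaply."""
    return min(templates, key=lambda word: dtw_distance(query, templates[word]))
```

Because DTW stretches and compresses the time axis during alignment, the same word mouthed at different speeds can still match its stored template, which suits small-vocabulary recognition of this kind.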
This paper describes a technique which generates speech acoustics from articulator movements. Our motivation is to help people who can no longer speak following laryngectomy, a procedure which is carried out tens of thousands of times per year in the Western world. Our method for sensing articulator movement, Permanent Magnetic Articulography, relies on small, unobtrusive magnets attached to the lips and tongue. Changes in magnetic field caused by magnet movements are sensed and form the input to a process which is trained to estimate speech acoustics. In the experiments reported here this 'Direct Synthesis' technique is developed for normal speakers, with glued-on magnets, allowing us to train with parallel sensor and acoustic data. We describe three machine learning techniques for this task, based on Gaussian Mixture Models (GMMs), Deep Neural Networks (DNNs) and Recurrent Neural Networks (RNNs). We evaluate our techniques with objective acoustic distortion measures and subjective listening tests over spoken sentences read from novels (the CMU Arctic corpus). Our results show that the best performing technique is a bidirectional RNN (BiRNN), which employs both past and future contexts to predict the acoustics from the sensor data. BiRNNs are not suitable for real-time synthesis, but fixed-lag RNNs give similar results and, because they look only a short way into the future, overcome this problem. Listening tests show that the speech produced by this method has a natural quality which preserves the identity of the speaker. Furthermore, we obtain up to 92% intelligibility on the challenging CMU Arctic material. To our knowledge, these are the best results obtained for a silent-speech system without a restricted vocabulary and with an unobtrusive device that delivers audio in close to real time. This work promises to lead to a technology that can truly give back the voices of people whose larynx has been removed.
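The core of the 'Direct Synthesis' idea, a sequence model regressing from sensor frames to acoustic features, can be illustrated with a short sketch. The PyTorch code below shows a bidirectional RNN of the general kind the abstract names; the layer sizes, the 9-channel sensor dimensionality, and the 25-dimensional acoustic feature target are assumptions for illustration and may differ from the paper's actual architecture and features.

```python
# Minimal sketch of sensor-to-acoustics regression with a bidirectional RNN.
# Dimensions and layer sizes are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn

class BiRNNSynthesiser(nn.Module):
    def __init__(self, sensor_dim=9, acoustic_dim=25, hidden=128):
        super().__init__()
        # A bidirectional LSTM sees both past and future sensor context,
        # matching the BiRNN idea described in the abstract.
        self.rnn = nn.LSTM(sensor_dim, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, acoustic_dim)

    def forward(self, sensors):       # sensors: (batch, frames, sensor_dim)
        features, _ = self.rnn(sensors)
        return self.out(features)     # (batch, frames, acoustic_dim)

model = BiRNNSynthesiser()
dummy = torch.randn(1, 200, 9)        # one utterance of 200 sensor frames
acoustics = model(dummy)              # predicted acoustic feature trajectory
```

A fixed-lag variant would replace the full backward pass with a few frames of lookahead (for example, by delaying the output of a unidirectional RNN), which is what makes near-real-time synthesis possible at a similar quality.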