In the speech technology research community there is an increasing trend toward open source solutions. We present a new tool in that spirit, WaveSurfer, which has been developed at the Centre for Speech Technology at KTH. It has been designed for tasks such as viewing, editing, and labeling audio data. WaveSurfer is built around a small core to which most functionality is added in the form of plug-ins. The tool has been designed to work on most common platforms and to be easy to configure and extend. WaveSurfer is provided as open source under the GPL, with the explicit goal that the speech community will jointly improve and expand its scope and capabilities.
Building acoustic models for children that can be used in spoken dialogue systems requires collecting speech data. Commercial recognizers available for Swedish are trained on adult speech, which makes them less suitable for children's computer-directed speech. This paper describes some experiments with on-the-fly voice transformation of children's speech. Two transformation methods were tested, one inspired by the Phase Vocoder algorithm and the other by the Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) algorithm. The speech signal is transformed before being sent to a speech recognizer trained on adult speech. Our results show that this method reduces error rates by about thirty to forty-five percent for child users.