Learners of a second language practice their pronunciation by listening to and imitating utterances from native speakers. Recent research has shown that choosing a well-matched native speaker to imitate can have a positive impact on pronunciation training. Here we propose a voice-transformation technique that can be used to generate the (arguably) ideal voice to imitate: the learner's own voice with a native accent. Our work extends previous research, which suggests that providing learners with prosodically corrected versions of their utterances can be a suitable form of feedback in computer-assisted pronunciation training. Our technique converts both prosodic and segmental characteristics by means of a pitch-synchronous decomposition of speech into glottal excitation and spectral envelope. We apply the technique to a corpus containing parallel recordings of foreign-accented and native-accented utterances, and validate the resulting accent conversions through a series of perceptual experiments. Our results indicate that the technique can reduce foreign accentedness without significantly altering the voice quality properties of the foreign speaker. Finally, we propose a pedagogical strategy for integrating accent conversion as a form of behavioral shaping in computer-assisted pronunciation training.
We describe and compare three methods that can be used to normalize articulatory data across speakers. The methods seek to explain systematic anatomical differences between a source and target speaker without modifying the articulatory velocities of the source speaker. The first method is the classical Procrustes transform, which allows for a global translation, rotation, and scaling of articulator positions. We present an extension to the Procrustes transform that allows independent translations of each articulator. The additional parameters provide a 35% increase in articulatory similarity between pairs of speakers when compared to classical Procrustes. Finally, the proposed extension is coupled with a data-driven articulatory synthesizer in an analysis-by-synthesis loop to select model parameters that best explain the predicted acoustic (rather than articulatory) differences. This normalization method is able to increase acoustic similarity between the source and target speakers by 34%. However, it also reduces articulatory similarity by 22%, which suggests that improvements in acoustic similarity do not necessarily require an increase in articulatory similarity.
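The difference between the classical Procrustes transform and an extension with per-articulator translations can be illustrated with a short NumPy sketch. This is a minimal least-squares formulation under stated assumptions, not the authors' implementation: the array layout (frames x articulators x dimensions), the function names, and the choice of a single shared rotation and scale are all hypothetical, and the analysis-by-synthesis step is not shown.

```python
import numpy as np


def _rotation_and_scale(Pc, Qc):
    """Least-squares rotation R and scale s so that s * Pc @ R approximates Qc.

    Pc and Qc are already-centered point sets of shape (N, D).
    """
    U, S, Vt = np.linalg.svd(Pc.T @ Qc)
    # Force a proper rotation (no reflection).
    d = 1.0 if np.linalg.det(U @ Vt) >= 0 else -1.0
    D = np.eye(Pc.shape[1])
    D[-1, -1] = d
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / (Pc ** 2).sum()
    return R, s


def classical_procrustes(X, Y):
    """Map source X onto target Y with one global translation, rotation, scale.

    X, Y: arrays of shape (T frames, K articulators, D dims); assumed layout.
    """
    dim = X.shape[-1]
    P, Q = X.reshape(-1, dim), Y.reshape(-1, dim)
    mx, my = P.mean(axis=0), Q.mean(axis=0)
    R, s = _rotation_and_scale(P - mx, Q - my)
    t = my - s * mx @ R                      # single translation, shape (D,)
    return s * X @ R + t


def extended_procrustes(X, Y):
    """Same rotation and scale for all articulators, one translation each."""
    dim = X.shape[-1]
    mx, my = X.mean(axis=0), Y.mean(axis=0)  # per-articulator means, (K, D)
    R, s = _rotation_and_scale((X - mx).reshape(-1, dim),
                               (Y - my).reshape(-1, dim))
    t = my - s * mx @ R                      # one translation per articulator
    return s * X @ R + t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3, 2))          # synthetic source trajectories
    c, sn = np.cos(0.3), np.sin(0.3)
    R0 = np.array([[c, sn], [-sn, c]])
    Y = 1.2 * X @ R0 + rng.normal(size=(3, 2))  # per-articulator offsets
    for fit in (classical_procrustes, extended_procrustes):
        print(fit.__name__, np.abs(fit(X, Y) - Y).max())
```

On synthetic data whose articulators are offset independently, the global transform leaves a residual that the per-articulator translations absorb, which is the kind of anatomical difference the extension is meant to capture.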
The objective of this research is to develop a low-cost infrared absorption spectroscope based on linear variable filter technology for the automated detection of concentrated gases and vapors, and the semiautomated detection of liquids. This instrument represents an alternative to electronic-nose devices based on cross-selective gas sensor arrays. Rather than relying on such arrays, the proposed instrument uses the concept of computational "pseudosensors," whereby spectral lines in an analytical instrument are clustered into groups and used as independent variables. We characterize the system on a database of chemical mixtures, and evaluate it on two real-world applications in the foodstuffs domain: oil adulteration and trans-fatty acid detection. Our results show that the proposed system is a viable low-resolution, low-cost analytical technique for niche applications.
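The "pseudosensor" idea, grouping spectral lines and treating each group as one variable, might be sketched as follows. The abstract does not say which clustering algorithm or grouping criterion is used, so this example assumes a plain k-means over standardized channel profiles with a deterministic farthest-point initialization; the function name `pseudosensors` and its parameters are hypothetical.

```python
import numpy as np


def pseudosensors(spectra, n_groups, n_iter=20):
    """Cluster spectral channels and average each cluster into one feature.

    spectra: array of shape (n_samples, n_channels).
    Returns (labels, features): labels[j] is the group of channel j, and
    features has shape (n_samples, n_groups).
    Assumes every group ends up with at least one channel.
    """
    # Describe each channel by its standardized response across samples.
    z = (spectra - spectra.mean(axis=0)) / (spectra.std(axis=0) + 1e-12)
    profiles = z.T                                   # (n_channels, n_samples)

    # Farthest-point initialization (an assumption made for determinism).
    centers = [profiles[0]]
    for _ in range(1, n_groups):
        d = np.min([((profiles - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(profiles[np.argmax(d)])
    centers = np.array(centers)

    # Standard k-means iterations over the channel profiles.
    for _ in range(n_iter):
        dist = ((profiles[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for g in range(n_groups):
            if np.any(labels == g):
                centers[g] = profiles[labels == g].mean(axis=0)

    # Each pseudosensor is the mean raw response of its channels.
    features = np.stack(
        [spectra[:, labels == g].mean(axis=1) for g in range(n_groups)], axis=1
    )
    return labels, features
```

The resulting `features` matrix could then be passed to any downstream classifier or regressor in place of the full spectrum.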