This paper introduces Sound Stream: a low-cost, tangible, ambidextrous controller that drives a dynamic muscle-based model of the human vocal tract for articulatory speech synthesis. The controller provides multidimensional inputs, which a microcontroller maps to the tongue muscles in the biomechanical modeling toolkit ArtiSynth. Because the vocal tract is a complex biological structure containing many muscles, providing control over every muscle in the proposed scheme would be challenging and computationally expensive. We therefore follow a simplified approach and control only selected muscles for efficient articulatory speech synthesis. The goal of designing an ambidextrous controller is to create new possibilities for controlling multiple parameters simultaneously, varying the tongue position and shape to generate various expressive vocal sounds. As a demonstration, the user learns to interact with and control a mid-sagittal view of the tongue structure in ArtiSynth through a set of sensors using both hands. Sound Stream explores and evaluates appropriate input and mapping methods for designing a controllable speech synthesis engine. Reference: 1. Wang, J. et al. (2011) “Squeezy: Extending a multi-touch screen with force sensing objects for controlling articulatory synthesis,” in Proceedings on New Interfaces for Musical Expression, Oslo, Norway, pp. 531–532.
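The sensor-to-muscle mapping described in this abstract could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the abstract does not say which muscles were selected, how many sensors are used, or what the sensor range is. The muscle names, the channel assignment, the 10-bit range, and the function `to_activations` are all hypothetical.

```python
# Hypothetical sketch: channel layout, muscle choice, and ADC range are
# assumptions for illustration, not details taken from the paper.
ADC_MAX = 1023  # assumes a 10-bit analog-to-digital converter

# Assumed assignment of sensor channels to a few tongue muscles.
SENSOR_TO_MUSCLE = {
    0: "genioglossus_posterior",
    1: "genioglossus_anterior",
    2: "styloglossus",
    3: "hyoglossus",
}


def to_activations(raw_readings):
    """Map raw sensor counts to muscle activations clamped to [0, 1]."""
    activations = {}
    for channel, muscle in SENSOR_TO_MUSCLE.items():
        raw = raw_readings[channel]
        # Normalize, then clamp so out-of-range readings stay valid.
        activations[muscle] = min(max(raw / ADC_MAX, 0.0), 1.0)
    return activations
```

Clamping keeps a noisy or out-of-range reading from producing an activation outside the interval a muscle model would normally accept; a real mapping might also smooth the signal or apply a nonlinear curve.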
The simulation of two-dimensional (2D) wave propagation is a computationally affordable task, and its use can potentially improve time performance in the acoustic analysis of vocal tracts. Several models have been designed that rely on 2D wave solvers and include 2D representations of three-dimensional (3D) vocal tract-like geometries. However, until now, only the acoustics of straight 3D tubes with circular cross-sections have been successfully replicated with this approach. Furthermore, simulating the resulting 2D shapes requires extremely high spatiotemporal resolutions, which dramatically reduces the speed gain obtained from using a 2D wave solver. In this paper, we introduce a novel vocal tract model, still in progress, that extends the 2D Finite-Difference Time-Domain (FDTD) wave solver into a 2.5D FDTD scheme by adding tube depth, derived from the area functions, to the acoustic solver. The model combines the speed of a light 2D numerical scheme with the ability to natively simulate 3D tubes that are symmetric in one dimension, hence relaxing previous resolution requirements. An implementation of the 2.5D FDTD is presented, along with an evaluation of its performance in the case of static vowel modeling. The paper discusses the current features and limitations of the approach, and its potential impact on computational acoustics applications.
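One plausible way to add tube depth to a 2D FDTD update is sketched below. It is not necessarily the scheme used in the paper: it assumes a depth-averaged formulation on a staggered grid, in which the continuity equation becomes h ∂p/∂t = −ρc² ∇·(h v) with h the local depth. The function names, the grid size, the linear depth profile, and the values of ρ and c are assumptions for illustration.

```python
import numpy as np

RHO = 1.14   # air density in kg/m^3 (assumed value)
C = 350.0    # speed of sound in m/s (assumed value)


def fdtd_step(p, vx, vy, depth, dt, dx):
    """Advance pressure p and face velocities vx, vy by one step, in place."""
    # Momentum: dv/dt = -(1/rho) grad p. Outer faces stay at zero (rigid walls).
    vx[:, 1:-1] -= dt / (RHO * dx) * (p[:, 1:] - p[:, :-1])
    vy[1:-1, :] -= dt / (RHO * dx) * (p[1:, :] - p[:-1, :])
    # Depth at interior faces: mean of the two adjacent cells.
    hx = np.zeros_like(vx)
    hy = np.zeros_like(vy)
    hx[:, 1:-1] = 0.5 * (depth[:, 1:] + depth[:, :-1])
    hy[1:-1, :] = 0.5 * (depth[1:, :] + depth[:-1, :])
    # Continuity: h dp/dt = -rho c^2 div(h v).
    flux_x = hx * vx
    flux_y = hy * vy
    div = (flux_x[:, 1:] - flux_x[:, :-1]
           + flux_y[1:, :] - flux_y[:-1, :]) / dx
    p -= dt * RHO * C**2 * div / depth


def run_demo(steps=50, ny=20, nx=40, dx=1e-3):
    """Propagate a single-cell pressure pulse in a closed box of varying depth."""
    dt = 0.5 * dx / (C * np.sqrt(2.0))  # half the 2D Courant limit
    depth = np.tile(np.linspace(0.01, 0.02, nx), (ny, 1))  # assumed profile, metres
    p = np.zeros((ny, nx))
    p[ny // 2, nx // 2] = 1.0
    vx = np.zeros((ny, nx + 1))
    vy = np.zeros((ny + 1, nx))
    for _ in range(steps):
        fdtd_step(p, vx, vy, depth, dt, dx)
    return p, depth
```

With uniform depth this update reduces to a standard 2D staggered-grid scheme. With rigid walls, the depth-weighted pressure sum `np.sum(p * depth)` stays constant over time, which gives a quick sanity check on the flux bookkeeping.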
The coupling of the vocal folds (source) and the vocal tract (filter) is one of the most critical factors in source-filter articulation theory. The traditional linear source-filter theory has been challenged by current research, which clearly shows the impact of acoustic loading on the dynamic behavior of vocal fold vibration as well as on variations in the shape of the glottal flow pulses. This paper outlines the underlying mechanism of source-filter interactions and demonstrates the design and working principles of coupling for various existing vocal fold and vocal tract biomechanical models. For our study, we have considered self-oscillating lumped-element models of the acoustic source and computational models of the vocal tract as articulators. To understand the limitations of the source-filter interactions associated with each of those models, we compare them with respect to their mechanical design, acoustic and physiological characteristics, and aerodynamic simulation. References: 1. Flanagan, J. L. (1968) “Source-system interaction in the vocal tract,” Ann. N.Y. Acad. Sci. 155, 9–17. 2. Lieberman, P., and Blumstein, S. E. (1988) Speech physiology, speech perception, and acoustic phonetics (Cambridge University Press, Cambridge, Mass). 3. Titze, I. R. (2008) “Nonlinear source-filter coupling in phonation: Theory,” J. Acoust. Soc. Am. 123, 2733–2749.
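For contrast with the coupled models discussed in this abstract, the traditional linear source-filter view can be sketched as an excitation that is computed independently of the tract and then passed through a cascade of formant resonators. This is a minimal illustration, not one of the models compared in the paper: the impulse-train source, the second-order resonator form, the formant frequencies and bandwidths, and all function names are assumptions.

```python
import math


def impulse_train(f0_hz, duration_s, sample_rate):
    """Idealized glottal source: one unit impulse per pitch period."""
    n = int(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Second-order digital resonator with unity gain at 0 Hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    gain = 1.0 - a1 - a2
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0_hz=100.0, duration_s=0.1, sample_rate=16000):
    """Filter the source through three formants (illustrative values only)."""
    formants = [(700.0, 80.0), (1220.0, 90.0), (2600.0, 120.0)]
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal
```

In this sketch the source never sees the filter: changing the formants leaves `impulse_train` untouched. The interaction effects the abstract refers to, such as acoustic loading altering vocal fold vibration and glottal pulse shape, would require feeding tract pressure back into the source model, which this sketch does not attempt.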