A phoneme-acquisition system was developed using a computational model that explains the developmental process of human infants in the early period of language acquisition. Two findings are important for modeling an infant's acquisition of phonemes: (1) an infant's vowel-like cooing tends to elicit imitative utterances from its caregiver, and (2) maternal imitation effectively reinforces infant vocalization. We therefore hypothesized that infants acquire phonemes by imitating their caregivers' voices through trial and error, i.e., infants use self-vocalization experience to search for imitable and unimitable elements in their caregivers' voices. On the basis of this hypothesis, we constructed a phoneme-acquisition process using interaction involving vowel imitation between a human and an infant model. Our infant model had a vocal tract system, called the Maeda model, and an auditory system implemented using Mel-Frequency Cepstral Coefficients (MFCCs) obtained through STRAIGHT analysis. We applied a Recurrent Neural Network with Parametric Bias (RNNPB) to learn the experience of self-vocalization, to recognize the human voice, and to produce the sound imitated by the infant model. To evaluate imitable and unimitable sounds, we used the prediction error of the RNNPB model. The experimental results revealed that, as imitation interactions were repeated, the formants of sounds imitated by our system moved closer to those of human voices, and our system could self-organize the same vowels in different continuous sounds. This suggests that our system can reflect the process of phoneme acquisition.
H. Kanda, T. Ogata, T. Takahashi, K. Komatani, and H. G. Okuno are with the Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto, Japan {hkanda, ogata, tall, komatani, okuno}@kuis.kyoto-u.ac.jp

I. INTRODUCTION

Our goal was to clarify how human infants acquire the ability to distinguish phonemes in early development. Infants can acquire spoken language through imitating the vocal output of their parents. This ability is closely related to the cognitive development of language.

Developmental psychologists have demonstrated that an infant's vowel-like cooing tends to elicit imitative utterances from its caregiver [1] and that maternal imitation effectively reinforces infant vocalization [2]. Infants have no innate knowledge of phonemes and perceive a phoneme sequence as a continuous acoustic signal. As they grow, infants acquire the ability to discover phoneme units in continuous speech sounds using prosody, rhythm, stress, and whether or not they can imitate the sound.

We hypothesized that infants acquire phonemes by repeatedly imitating their caregivers' voices through trial and error, i.e., infants use self-vocalization experience to search for imitable and unimitable elements in their caregivers' voices. In this paper, we define phoneme acquisition as the ability of infants to produce sounds close to their caregivers' voices.

The human-development studies ...