Mizo, a Tibeto-Burman language of the Kuki-Chin group, is spoken primarily in the state of Mizoram in Northeast India. Mizo has four lexical tones: high (H), rising (R), falling (F), and low (L). Mizo tones are mostly dynamic, except for the H tone, which is static. Previous research has reported that the rising tone in Mizo changes into a low tone when it is followed by either a high tone or a falling tone, a process regarded as rising tone sandhi. The present study analyzes the production and perception of rising tone sandhi. The production analysis compares the F0 contours of the derived low tone of rising tone sandhi in trisyllabic phrases with those of the low tone in citation form and the low tone in phrases. Results show that the F0 contour of the sandhi-derived low tone differs from that of the canonical low tone in Mizo. The perception study, conducted as an identification test, shows that speakers of Mizo can distinguish the low tone derived from rising tone sandhi from the canonical low tone, which indicates that tone sandhi in Mizo is perceptually categorical.
The performance of speech recognition systems degrades severely in noisy environments. In this work, we present a method to improve the performance of a Mizo digit recognition system under different noisy conditions using data augmentation and tonal information. Mizo is a tonal language, and each digit in Mizo is spoken with one of the four tones present in the language. The tone therefore carries information about the spoken digit. Tone is related to the excitation source, and excitation source information is more robust to noise than vocal tract information. The normalized cross-correlation function, pitch, and pitch dynamics are used as additional features to represent the tonal information, and they improve on Mel frequency cepstral coefficient (MFCC) based baseline systems in noisy conditions. Data augmentation is another technique used in the literature for robust speech recognition. Using data augmentation further improves the performance of Mizo digit recognition.
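As a rough illustration of the tonal features named above, the sketch below computes a per-frame normalized cross-correlation peak, a pitch estimate taken from the lag of that peak, and a simple pitch-dynamics term. It is not the authors' implementation: the function names (`nccf`, `frame_pitch`, `tonal_features`), the 40 ms frame, the 10 ms hop, the 75 to 400 Hz search range, and the use of a first-order gradient for pitch dynamics are all assumptions made for this example.

```python
import numpy as np

def nccf(frame, window, max_lag):
    """Normalized cross-correlation between frame[:window] and its lagged copies."""
    ref = frame[:window]
    e0 = np.dot(ref, ref)
    scores = np.zeros(max_lag + 1)
    for k in range(max_lag + 1):
        seg = frame[k:k + window]
        denom = np.sqrt(e0 * np.dot(seg, seg))
        # Guard against silent frames, where both energies are zero.
        scores[k] = np.dot(ref, seg) / denom if denom > 0 else 0.0
    return scores

def frame_pitch(frame, fs, fmin=75.0, fmax=400.0):
    """Return (pitch in Hz, NCCF peak value) for one frame.

    The search range fmin..fmax is an assumed setting, not taken from the paper.
    """
    min_lag = int(fs / fmax)
    max_lag = int(fs / fmin)
    window = len(frame) - max_lag  # keeps every lagged segment inside the frame
    scores = nccf(frame, window, max_lag)
    best = min_lag + int(np.argmax(scores[min_lag:max_lag + 1]))
    return fs / best, scores[best]

def tonal_features(signal, fs, frame_len=0.04, hop=0.01):
    """Stack [pitch, NCCF peak, delta pitch] for each frame of the signal."""
    n = int(round(frame_len * fs))
    h = int(round(hop * fs))
    rows = [frame_pitch(signal[s:s + n], fs)
            for s in range(0, len(signal) - n + 1, h)]
    feats = np.array(rows)            # columns: pitch, NCCF peak
    delta = np.gradient(feats[:, 0])  # first-order pitch dynamics (assumed form)
    return np.column_stack([feats, delta])
```

In a setup like the one described, these three columns would be appended to the MFCC vector of each frame. A practical pitch tracker would also need voicing decisions and protection against octave errors, which this sketch omits.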