This paper presents our strategies for developing an automatic speech recognition (ASR) system for Iban, an under-resourced language. We faced several challenges, including the absence of a pronunciation dictionary and a lack of training material for building acoustic models. To overcome these problems, we proposed approaches that exploit resources from a closely related language (Malay). We developed a semi-supervised method for building the pronunciation dictionary and applied cross-lingual strategies to improve acoustic models trained with very limited data. Both approaches gave very encouraging results, which show that data from a closely related language, if available, can be exploited to build an ASR system for a new language. In the final part of the paper, we present a zero-shot ASR system using Malay resources that can be used as an alternative method for transcribing Iban speech.
This paper describes our work on an automatic speech recognition (ASR) system for an under-resourced language, namely Iban, which is spoken in Sarawak, a Malaysian state on Borneo. To begin this study, we collected 8 hours of speech data, because no ASR resources existed for this language. Given this lack of resources, we employed bootstrapping techniques on a closely related language to build the Iban system. Specifically, we used Malay data to bootstrap the grapheme-to-phoneme (G2P) system for the target language. We also developed several G2P systems to obtain Iban pronunciation dictionaries, which were then evaluated in the Iban ASR system to select the best version. Subsequently, we conducted cross-lingual ASR experiments using subspace Gaussian mixture models (SGMM), in which the shared parameters were obtained in either a monolingual or a multilingual fashion. We observed that using out-of-language data as the source language yielded a lower word error rate (WER) when Iban data is very limited.
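The WER metric mentioned above can be illustrated with a minimal sketch. It assumes the standard definition (word-level edit distance divided by the number of reference words); the function name `word_error_rate` and the placeholder tokens are illustrative only and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: whitespace tokenization and unit cost for substitutions,
    insertions and deletions (the usual WER definition).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
```

Lower values are better; a system is typically compared across training conditions by computing this ratio over the whole test set rather than per utterance.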
The decorative plaited mat is one of many examples of the rich plait work often seen on Borneo handicraft products. Plaited mats are decorated with simple and complex motif designs, each with its own special meaning and taboos. The motif designs reflect the environment and the traditional beliefs of the Iban community. In line with efforts by UNESCO and the Sarawak Government, digitization and the use of IR4.0 technologies to preserve and promote this cultural heritage are encouraged. Toward this goal, we present a novel image dataset containing 10 Iban plaited mat motif classes. The plaited mat motifs are made of diagonal and symmetrical shapes, as well as geometric and non-geometric patterns. Classification accuracy using scale-invariant feature transform (SIFT) features was evaluated against 6 common image deformations (zoom+rotation, viewpoint, image blur, JPEG compression, scale and illumination) across multiple threshold values. Varying degrees of each deformation were applied to a digitally cleaned (and cropped) image of each mat motif class. We used RANSAC to remove outliers from the noisy SIFT matching result. The optimal threshold value is 2.0e-2, with a reported 100.0% matching accuracy for the scale change and zoom+rotation sets.
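The RANSAC outlier-removal step can be sketched as follows. This is a deliberately simplified illustration, not the authors' pipeline: it assumes keypoint matches are already available as coordinate pairs and fits a translation-only model, whereas a real SIFT matching setup would more likely estimate a homography. The function name `ransac_translation` and the parameters `tol`, `iterations` and `seed` are hypothetical.

```python
import random


def ransac_translation(matches, tol=3.0, iterations=100, seed=0):
    """Return indices of matches consistent with one dominant translation.

    matches: list of ((x1, y1), (x2, y2)) keypoint pairs.
    Assumption: the two images differ by a pure translation, so a single
    match is enough to hypothesize a model.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # 1. Sample a minimal set (one match) and derive a candidate model.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1

        # 2. Count the matches that agree with the model within tolerance.
        inliers = [
            i
            for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if abs((bx - ax) - dx) <= tol and abs((by - ay) - dy) <= tol
        ]

        # 3. Keep the model with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


if __name__ == "__main__":
    # Four matches share a shift of roughly (10, 5); two are spurious.
    demo = [
        ((0, 0), (10, 5)),
        ((1, 2), (11, 7)),
        ((5, 5), (15.5, 10)),
        ((3, 8), (13, 12.5)),
        ((2, 2), (40, -7)),
        ((7, 1), (0, 30)),
    ]
    print(ransac_translation(demo))
```

The matches outside the returned index list are the outliers that would be discarded before computing matching accuracy.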