Sarah Samson Juan scite author profile

Sarah Samson Juan

4 Publications

28 Citation Statements Received

64 Citation Statements Given

Affiliations

Universiti Malaysia Sarawak, Joseph Fourier University, Grenoble Computer Science Laboratory

Publications

Using resources from a closely-related language to develop ASR for a very under-resourced language: a case study for Iban

Juan

Besacier

Lecouteux

et al. 2015

This paper presents our strategies for developing an automatic speech recognition system for Iban, an under-resourced language. We faced several challenges such as no pronunciation dictionary and lack of training material for building acoustic models. To overcome these problems, we proposed approaches which exploit resources from a closely-related language (Malay). We developed a semi-supervised method for building the pronunciation dictionary and applied cross-lingual strategies for improving acoustic models trained with very limited training data. Both approaches displayed very encouraging results, which show that data from a closely-related language, if available, can be exploited to build ASR for a new language. In the final part of the paper, we present a zero-shot ASR using Malay resources that can be used as an alternative method for transcribing Iban speech.

Quantifying the relationship between the climate and Hand-Foot-Mouth Disease (HFMD) incidences

Leong

Labadin

Rahman

et al. 2011

Using closely-related language to build an ASR for a very under-resourced language: Iban

Juan

Besacier

Lecouteux

et al. 2014

This paper describes our work on an automatic speech recognition (ASR) system for an under-resourced language, Iban, which is spoken in Sarawak, a Malaysian state on Borneo. To begin this study, we collected 8 hours of speech data, since no ASR resources existed for this language. Given this lack of resources, we employed bootstrapping techniques on a closely-related language to build the Iban system: we used Malay data to bootstrap the grapheme-to-phoneme (G2P) system for the target language. We also developed several G2Ps to acquire Iban pronunciation dictionaries, which were then evaluated on the Iban ASR to select the best version. Subsequently, we conducted experiments on cross-lingual ASR using subspace Gaussian Mixture Models (SGMM), where the shared parameters were obtained in either monolingual or multilingual fashion. From our observations, using out-of-language data as the source language gave a lower word error rate (WER) when Iban data is very limited.

Performance evaluation of SIFT against common image deformations on Iban plaited mat motif images

Joseph

Hipiny

Ujir

et al. 2021

IJEECS

The decorative plaited mat is one of the many examples of rich plait work often seen on Borneo handicraft products. The plaited mats are decorated with simple and complex motif designs; each has its own special meaning and taboos. The motif designs reflect the environment and the traditional beliefs of the Iban community. In line with efforts by UNESCO and the Sarawak Government, digitization and the use of IR4.0 technologies to preserve and promote this cultural heritage are encouraged. Towards this goal, we present a novel image dataset containing 10 Iban plaited mat motif classes. The plaited mat motifs are made of diagonal and symmetrical shapes, as well as geometric and non-geometric patterns. Classification accuracy using scale-invariant feature transform (SIFT) features was evaluated against 6 common image deformations: zoom+rotation, viewpoint, image blur, JPEG compression, scale and illumination, across multiple threshold values. Varying degrees of each deformation were applied to a digitally cleaned (and cropped) image of each mat motif class. We used RANSAC to remove outliers from the noisy SIFT matching result. The optimal threshold value is 2.0e-2, with a reported 100.0% matching accuracy for the scale change and zoom+rotation sets.
