This paper presents a deep-learning-based assessment method for a spoken computer-assisted language learning (CALL) system for non-native child speakers, which follows a data-driven rather than a rule-based approach. In particular, we focus on the spoken CALL assessment task of the 2017 SLaTE challenge. To this end, the proposed method consists of four main steps: speech recognition, meaning feature extraction, grammar feature extraction, and deep-learning-based assessment. First, speech recognition is performed on an input utterance using three automatic speech recognition (ASR) systems. Second, twenty-seven meaning features are extracted from the texts recognized by the three ASR systems using language models (LMs), sentence-embedding models, and word-embedding models. Third, twenty-two grammar features are extracted from the text recognized by one ASR system using linear-order LMs and hierarchical-order LMs. Fourth, the resulting forty-nine features are fed into a fully connected deep neural network (DNN)-based model for classification into acceptance or rejection. Finally, the assessment is made by comparing the probability of an output unit of the DNN-based classifier with a predefined threshold. For the spoken CALL assessment experiments, we use English utterances spoken by Swiss German teenagers. The experiments show that the spoken CALL assessment system employing the proposed method achieves a D score of 4.37.
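The final two steps above can be sketched as a forward pass through a fully connected network followed by thresholding. The sketch below uses NumPy with randomly initialized weights standing in for a trained model; the hidden-layer size and the threshold of 0.5 are illustrative assumptions, as the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# From the abstract: 27 meaning + 22 grammar features = 49 inputs.
N_FEATURES = 49
HIDDEN = 32          # hidden size is an assumption, not from the paper
THRESHOLD = 0.5      # placeholder; the paper's threshold is not given

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialized weights stand in for a trained DNN.
W1 = rng.normal(0.0, 0.1, (N_FEATURES, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, 1))
b2 = np.zeros(1)

def assess(features):
    """Return (probability, decision) for one 49-dim feature vector."""
    h = relu(features @ W1 + b1)
    p = float(sigmoid(h @ W2 + b2)[0])
    return p, ("accept" if p >= THRESHOLD else "reject")

p, decision = assess(rng.normal(size=N_FEATURES))
```

In a real system the weights would be learned on labeled accept/reject data, and the threshold tuned on a development set to trade off false accepts against false rejects.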
Two new methods are proposed for unsupervised adaptation of a language model (LM) using a single sentence, for automatic transcription tasks. In the training phase, training documents are clustered by latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, an adapted LM is constructed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weights assigned to the trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are consistent with each other. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
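The linear-mixture step can be illustrated with toy unigram LMs, one per LDA cluster. This is a minimal sketch, not the paper's method: the cluster LMs, vocabulary, and mixture weights below are all invented for illustration, and the weights stand in for the topic posterior that the trained LDA model would infer from the single test sentence.

```python
# Toy domain-specific unigram LMs (one per LDA cluster); in the paper
# these would be n-gram LMs trained on LDA-clustered documents.
cluster_lms = [
    {"stock": 0.4, "market": 0.4, "game": 0.1, "score": 0.1},  # finance-like
    {"stock": 0.1, "market": 0.1, "game": 0.4, "score": 0.4},  # sports-like
]

def adapt_lm(weights, lms):
    """Linear mixture: P_adapted(w) = sum_k weight_k * P_k(w)."""
    vocab = set().union(*lms)
    return {w: sum(wk * lm.get(w, 0.0) for wk, lm in zip(weights, lms))
            for w in vocab}

# Hypothetical topic posterior for a sports-like test sentence.
weights = [0.2, 0.8]
adapted = adapt_lm(weights, cluster_lms)
# Because the mixture weights sum to 1 and each component LM is a
# proper distribution, the adapted LM is also a proper distribution.
```

The design point is that the same LDA model performs both the clustering at training time and the weight estimation at test time, so no separate weight-estimation heuristic is needed.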
Owing to the rising demand for second-language learning and advances in machine learning, there has been an increasing need for spoken computer-assisted language learning (CALL) applications [1,2]. Moreover, with the spread of Korean popular culture overseas [3], the demand for Korean language learning has prompted the development of such CALL applications for non-native Korean learners. Among spoken Korean CALL applications, this paper focuses on automatic speech recognition (ASR)-based proficiency assessment for non-native Korean speech. Non-native speech significantly degrades the performance of the ASR used in a spoken CALL system owing to the pronunciation variability of non-native speech [4,5]. Consequently, numerous studies have reported automatic proficiency assessment methods for non-native speech that is read aloud [6-13] and for spontaneous speech [14-17]. However, there has been limited research on proficiency assessment of non-native Korean speech [18]. Moreover, most research has focused on analyzing pronunciation variability in non-native Korean speech. For instance, [19,20] analyze the pronunciation variability of Korean spoken by Japanese and Chinese learners using contrastive and