“…For each test sentence of $N_W$ words, of which $N_{OOV}$ are out-of-vocabulary (OOV), we compute the following 5 features using each LM (taking inspiration for features from the works of [29,30,12,9,10]): a) $\log(P)/N_W$, the average log-probability of the sentence; b) $\log(P_{OOV})/N_{OOV}$, the average contribution of OOV words to the log-probability of the sentence; c) $(\log(P)-\log(P_{OOV}))/N_W$, the average log-difference between the two probabilities above; d) $N_W - N_{bo}$, where $N_{bo}$ is the number of back-offs applied by the LM to the input sentence (this difference is related to the frequency of n-grams in the sentence that were also observed in the training set); e) $N_{OOV}$, the number of OOVs in the sentence. Note that if word counts $N_W$ or $N_{OOV}$ are equal to zero (i.e.…”
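The five features above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the function name and its inputs (per-word log-probabilities, per-word OOV flags, and a back-off count, all presumed to come from some LM toolkit) are hypothetical, and the fallback of returning 0.0 when $N_W$ or $N_{OOV}$ is zero is an assumption, since the sentence describing that case is truncated in the excerpt.

```python
def sentence_features(word_logprobs, oov_flags, n_backoffs):
    """Compute the 5 LM-based sentence features (a)-(e) described in the text.

    word_logprobs: list of per-word log-probabilities assigned by the LM
    oov_flags:     list of booleans, True where the word is out-of-vocabulary
    n_backoffs:    number of back-offs the LM applied to the sentence (N_bo)
    """
    n_w = len(word_logprobs)                 # N_W: sentence length in words
    n_oov = sum(oov_flags)                   # N_OOV: number of OOV words
    log_p = sum(word_logprobs)               # log(P): total sentence log-prob
    # log(P_OOV): log-prob mass contributed by OOV words only
    log_p_oov = sum(lp for lp, is_oov in zip(word_logprobs, oov_flags) if is_oov)

    f_a = log_p / n_w if n_w else 0.0                 # (a) avg log-probability
    f_b = log_p_oov / n_oov if n_oov else 0.0         # (b) avg OOV contribution
    f_c = (log_p - log_p_oov) / n_w if n_w else 0.0   # (c) avg log-difference
    f_d = n_w - n_backoffs                            # (d) N_W - N_bo
    f_e = n_oov                                       # (e) OOV count
    return (f_a, f_b, f_c, f_d, f_e)
```

For example, a 4-word sentence with two OOVs and one back-off yields one 5-dimensional feature vector per LM; concatenating these vectors across LMs gives the input representation for the downstream classifier.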