2018
DOI: 10.1016/j.csl.2017.11.001

Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening

Cited by 16 publications
(7 citation statements)
references
References 30 publications
“…The deep learning models used were inspired by state-of-the-art automatic speech recognition (ASR) networks (Amodei et al, 2015). ASR systems without language models are error prone when detecting the canonical structure of resyllabified sequences (Adda-Decker et al, 2002;Mirzaei et al, 2018;Wu et al, 1997). For example, a sequence like "fade out" could be recognised as "Fay doubt" if the coda /d/ is resyllabified as the onset of the second syllable.…”
Section: B Using Deep Neural Network With Acoustic Data To Identify R...
Citation type: mentioning
confidence: 99%
“…In another direction, it is worth highlighting the work from (Mirzaei et al 2018), which takes the output of automatic speech recognition and analyses the errors committed to estimating the difficulties in L2 speech. Among the most common types of errors, the authors identify homophones, minimal pairs, negatives and breached boundaries.…”
Section: Phonetic Assessment
Citation type: mentioning
confidence: 99%
“…For the baseline version, we used rule-based coarse-grained level assignments to roughly categorize learners into three language proficiency levels (beginners, intermediate, advanced) based on learner's assessment tests (TOEFL/TOEIC score, speech rate tolerance, vocabulary size). Word selection is determined by defining thresholds for specific features, including word frequency and speech rate, while also incorporating additional factors like automatic speech recognition system errors, word specificity, proper names, and abbreviations (Mirzaei et al, 2018). and requirements.…”
Section: Personalized Caption
Citation type: mentioning
confidence: 99%