Interspeech 2021
DOI: 10.21437/interspeech.2021-86

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention

Cited by 10 publications
(12 citation statements)
References 0 publications
“…Compared to [22], the proposed mixup is able to generate as many as artificial word samples with a small amount of task-related unlabeled data. Different from [16,26], where the label of generated training samples is binary (e.g., correct or mispronounced), proposed mixup can generate continuous scores for augmented data.…”
Section: Mixup Data Generation (mentioning)
confidence: 99%
“…The comparison in [25] show that the largest nonnative corpus contains 90,841 utterances, but it is not publicly available. To tackle the limited non-native data problem, data augmentation techniques have been investigated in some previous work [16,22,26]. In [26], Text-To-Speech (TTS) was used to synthesize "incorrect stress" samples on top of modified lexical stress.…”
Section: Introduction (mentioning)
confidence: 99%
“…The label information also includes an ID representing the pronounced words from 0 to 99. The dataset is motivated by the lack of data for training or finetuning speech representation models, such as the "wav2vec" model [2], and "HuBERT" [3], and to facilitate the development of Arabic language pronunciation mistake identifiers [4], [5], [6].…”
Section: Dataset Description (mentioning)
confidence: 99%
“…In recent years, Computer Aided Pronunciation Training (CAPT) tools have been developed to provide diagnosis and feedback on phonetic-level errors (phoneme substitution, deletion, insertion [1,2,3,4,5]) and prosodic-level errors (e.g. lexical stress, intonation [6]). In this study, we focus on detecting phonetic-level pronunciation errors for L2 speech intelligibility and accentedness assessment.…”
Section: Introduction (mentioning)
confidence: 99%