2021 IEEE Spoken Language Technology Workshop (SLT) 2021
DOI: 10.1109/slt48900.2021.9383608

The SLT 2021 Children Speech Recognition Challenge: Open Datasets, Rules and Baselines

Cited by 11 publications
(5 citation statements)
References 14 publications
“…The use of SSL for child ASR was first seen at Interspeech2021, where a model using SSL [32] received first place for non-native child speech challenge. A similar use case [24] was also presented in the SLT 2021 children speech recognition challenge [33]. Another approach is used in [34], where the author uses a bidirectional unsupervised model pretraining with child speech ASR.…”
Section: B Self-supervised Learning For Child ASR
Citation type: mentioning
confidence: 99%
“…Noise from background sources affects capacity of fault tolerance, reliability, accuracy, efficiency and performance of the audio input processing (17) system thus resulting in low output. Another issue is the synchronization of the user reaction with that of ready audio input device.…”
Section: Efficiency and Performance Of The Input System For Audio Pro...
Citation type: mentioning
confidence: 99%
“…Among the existing problems is the need for a large volume of labeled data to adequately train models capable of identifying speech variations and regionalisms. Recently, approaches based on self-supervised training have enabled the emergence of pre-trained models that can be specialized for particular tasks using smaller sets of audio, as is the case with Wav2Vec2 [Yu et al 2021, Fan et al 2021, Jain et al 2022]. However, this approach has not yet been explored in the context of fluency assessment.…”
Section: Related Work (original: Trabalhos Relacionados)
Citation type: unclassified