2014
DOI: 10.1016/j.specom.2013.07.001

A smartphone-based ASR data collection tool for under-resourced languages

Cited by 74 publications (44 citation statements)
References 11 publications
“…For the Xitsonga experiments, we use data from the NCHLT corpus [37]. Around 12k UTD terms from 24 speakers are discovered from the complete unlabelled training set, with 6.5k true words used for testing.…”
Section: A Data (mentioning)
confidence: 99%
“…The other set contains carefully uttered, read speech in Xitsonga (2h 29min), a southern African Bantu language. The latter is an excerpt of the NCHLT corpus [22]. All speech segments contain non-overlapping speech of exactly one speaker and are free of non-human noises and pauses.…”
Section: Data (mentioning)
confidence: 99%
“…As for the second experiment, a 10.5-h and 12-talker subset of the American English Buckeye corpus [24] and a Tsonga dataset [25] containing a total of 4.4 hours of speech from 24 talkers were used similarly to the ZS-2015 challenge (see [16] for details). Following [1], syllable clustering was done in a speaker dependent setting in the second experiment.…”
Section: Data and Pre-processing (mentioning)
confidence: 99%