2021
DOI: 10.48550/arxiv.2111.09344
Preprint

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

Abstract: The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset). The data is collected via searching the Internet for appropriately licensed audio data with existing transcriptions. We describe our data collection methodology and release our data collection system under the Apache 2.0 license. We show that a model trained on this dataset achieves a 9.98% word error rate…

Citations: cited by 3 publications (2 citation statements)
References: 8 publications
“…The same approach was used to evaluate NaijaVoice relative to VoxCeleb on the Nigerian mini database [6]. This approach to evaluation is not new, it has been applied by [14] and [51]. This research, however, puts a value to it in terms of relative purity using (8) and (9), as shown at the bottom of the page, for evaluating NaijaFace and NaijaVoice respectively.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…English Librispeech [7], mTEDx [18], GigaSpeech [8], MLS [12], The Peoples's Speech [22], CommonVoice [11], MSR-86K…”
Section: Language Corpus Total Hours
Citation type: mentioning
confidence: 99%