2017 Pattern Recognition Association of South Africa and Robotics and Mechatronics (PRASA-RobMech) 2017
DOI: 10.1109/robomech.2017.8261149

Bilateral G2P accuracy: Measuring the effect of variants

Abstract: Incorporating pronunciation variants in a dictionary is controversial, as this can be either advantageous or detrimental for a speech recognition system. Grapheme-to-phoneme (G2P) accuracy can help guide this decision, but calculating the G2P accuracy of variant-based dictionaries is not fully straightforward. We propose a variant matching technique to measure G2P accuracy in a principled way when both the reference and hypothesised dictionaries may include variants. We use the new measure to evaluate G2P accu…

Cited by 1 publication (2 citation statements). References 10 publications.
“…The workshop had a shared task in language identification, with eight different teams submitting systems on the four language pairs included (Spanish-English, Nepali-English, Mandarin-English and Modern Standard Arabic-Egyptian Arabic) (Solorio et al, 2014). Additionally, prior to this workshop, some work had been done on word-level language identification in Turkish-Dutch data (Nguyen and Dogruöz, 2013) and on language identification on isolated tokens in South African languages (Giwa and Davel, 2013), both with an eye towards analyzing codeswitching.…”
Section: Language Identification (mentioning)
confidence: 99%
“…Most, if not all, of the previous approaches to word-level language identification utilized character n-grams as one of the primary features (Nguyen and Dogruöz, 2013; Giwa and Davel, 2013; Lin et al, 2014; Chittaranjan et al, 2014; Solorio et al, 2014). Those focused on intrasentential codeswitching also utilized varying amounts of context.…”
Section: Language Identification (mentioning)
confidence: 99%