ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054362

Universal Phone Recognition with a Multilingual Allophone System

Abstract: Multilingual models can improve language processing, particularly for low resource situations, by sharing parameters across languages. Multilingual acoustic models, however, generally ignore the difference between phonemes (sounds that can support lexical contrasts in a particular language) and their corresponding phones (the sounds that are actually spoken, which are language independent). This can lead to performance degradation when combining a variety of training languages, as identically annotated phoneme…

Citations: cited by 78 publications (62 citation statements)
References: 16 publications
“…Our work is closest to [9] where authors discover symbolic units in an unsupervised fashion for speech to speech translation. Contrary to this work, we employ the symbolic units generated by Allosaurus [10] which is trained in a supervised fashion.…”
Section: Related Work (mentioning)
confidence: 99%
“…The dataset in each language contains two voices - one male and one female. The audios are then passed into Allosaurus [10] to discover phonetic units and create a phonetic transcription of the audio.…”
Section: Dataset For Indic Languages (mentioning)
confidence: 99%
“…Similarly to LENA (Xu et al, 2008), we explore the use of automatic phone recognition as a feature extractor for ALUC estimation. For this purpose, we use Allosaurus, a language-independent phone recognizer (Li et al, 2020). Allosaurus is a multilayer Long Short-Term Memory (LSTM) neural network model trained on 12 different languages for phone and phoneme recognition.…”
Section: Phone Recognition (mentioning)
confidence: 99%
“…For ASR it is possible to combine a target-language language model with an acoustic model from a phonologically similar language, with no need for parallel datasets of audio recordings and transcriptions. Such approaches are likely to get even more effective with nearly-universal acoustic models (Li et al, 2020) and more scalable grapheme-to-phoneme modeling approaches (Deri and Knight, 2016; Mortensen et al, 2018; Bleyan et al, 2019; Ritchie et al, 2020;. Even if more work is needed to establish when such approaches will work well (Marchisio et al, 2020; Artetxe et al, 2020; Wu and Dredze, 2020), having useful monolingual text corpora across languages is clearly a prerequisite to exploring such approaches further.…”
Section: Introduction (mentioning)
confidence: 99%