Transfer learning from a multilingual model has shown favorable results on low-resource automatic speech recognition (ASR). However, full-model fine-tuning produces a separate model for every target language, which makes it impractical to deploy and maintain in production. The key challenge is how to extend the pre-trained model efficiently with few additional parameters. In this paper, we propose to combine the adapter module with meta-learning algorithms to achieve high recognition performance under low-resource settings and improve the parameter efficiency of the model. Extensive experiments show that our methods achieve recognition rates comparable or even superior to those of state-of-the-art baselines on low-resource languages, especially under very-low-resource conditions, with a significantly smaller model profile.
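To make the parameter-efficiency argument concrete, the following is a minimal sketch of an adapter, assuming the common bottleneck design (down-projection, nonlinearity, up-projection, residual connection). The abstract does not specify the adapter architecture, so the class name `Adapter`, the ReLU nonlinearity, the zero initialization of the up-projection, and the sizes used are illustrative assumptions rather than the authors' implementation.

```python
import random


class Adapter:
    """Hypothetical bottleneck adapter: h + W_up * relu(W_down * h + b_down) + b_up."""

    def __init__(self, d_model: int, bottleneck: int, seed: int = 0):
        rng = random.Random(seed)
        # Down-projection: d_model -> bottleneck (small random init, an assumption).
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                       for _ in range(bottleneck)]
        self.b_down = [0.0] * bottleneck
        # Up-projection: bottleneck -> d_model. Zero init (an assumption) makes the
        # adapter an identity map before any fine-tuning.
        self.w_up = [[0.0] * bottleneck for _ in range(d_model)]
        self.b_up = [0.0] * d_model

    def num_params(self) -> int:
        """Count weights and biases of both projections."""
        d_model, bottleneck = len(self.w_up), len(self.w_down)
        return (d_model * bottleneck + bottleneck) + (bottleneck * d_model + d_model)

    def forward(self, h: list) -> list:
        """Apply the adapter to one hidden vector of length d_model."""
        z = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(self.w_down, self.b_down)]
        delta = [sum(w * v for w, v in zip(row, z)) + b
                 for row, b in zip(self.w_up, self.b_up)]
        # Residual connection keeps the frozen backbone's representation intact.
        return [x + d for x, d in zip(h, delta)]


def trainable_fraction(backbone_params: int, n_layers: int, adapter: Adapter) -> float:
    """Share of all parameters that are trainable if only adapters are updated."""
    adapter_params = n_layers * adapter.num_params()
    return adapter_params / (backbone_params + adapter_params)
```

Under this design only the adapter weights are updated per target language, so the per-language storage cost scales with `num_params()` per layer rather than with the size of the backbone.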
Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language. Since the low-resource language has limited training data, speech recognition models can easily overfit. In this paper, we investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation. Based on our previous MetaAdapter, which implicitly leverages adapters, we propose a novel algorithm called SimAdapter that explicitly learns knowledge from adapters. Both algorithms rely on adapters, which can be easily integrated into the Transformer structure. MetaAdapter leverages meta-learning to transfer general knowledge from the training data to the test language. SimAdapter aims to learn the similarities between the source and target languages during fine-tuning using the adapters. We conduct extensive experiments on five low-resource languages in the Common Voice dataset. Results demonstrate that our MetaAdapter and SimAdapter methods can reduce WER by 2.98% and 2.55%, respectively, with only 2.5% and 15.5% of the trainable parameters, compared to the strong full-model fine-tuning baseline. Moreover, we show that these two algorithms can be integrated for better performance, with up to 3.55% relative WER reduction.
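The idea of learning similarities between source and target languables through adapters can be illustrated with an attention-style fusion over per-language adapter outputs. This is only a sketch under stated assumptions: the abstract does not give SimAdapter's formulation, so the scaled dot-product scoring, the absence of learned query/key/value projections, the function name `fuse_adapters`, and the language keys are all hypothetical.

```python
import math


def softmax(scores: list) -> list:
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_adapters(hidden: list, adapter_outputs: dict) -> tuple:
    """Combine per-language adapter outputs with similarity-based weights.

    hidden:          backbone hidden vector for one frame (used as the query).
    adapter_outputs: maps a language name to that adapter's output vector
                     (used as both key and value; an assumption of this sketch).
    Returns the fused vector and the per-language attention weights.
    """
    names = list(adapter_outputs)
    scale = math.sqrt(len(hidden))
    # Scaled dot-product similarity between the query and each adapter output.
    scores = [sum(h * a for h, a in zip(hidden, adapter_outputs[n])) / scale
              for n in names]
    weights = softmax(scores)
    # Weighted sum of adapter outputs, dimension by dimension.
    fused = [sum(w * adapter_outputs[n][i] for w, n in zip(weights, names))
             for i in range(len(hidden))]
    return fused, dict(zip(names, weights))


# Usage with two hypothetical source-language adapters.
fused, weights = fuse_adapters(
    [1.0, 0.0],
    {"lang_a": [1.0, 0.0], "lang_b": [0.0, 1.0]},
)
```

In this toy call the adapter whose output aligns with the hidden state receives the larger weight, which is the sense in which the weights can be read as a similarity between languages.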