Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system

Liu, Danyang; Ji, Xu; Zhang, Pengyuan; Yan, Yonghong

doi:10.1109/jas.2019.1911693

Cited by 10 publications

(3 citation statements)

References 31 publications

“…A generic ASR model can also be adapted to another narrow domain using DTL. With the help of high-resource languages, several knowledge transfer methods are investigated in [129] to overcome the data sparsity problem. The first is the DTL and fine-tuning techniques, which uses a well-trained neural network to initialize the LHN parameters.…”

Section: Cross-language DTL (mentioning)

confidence: 99%

“…Multilingual training can be thought of as a series of shared hidden layers (SHL) and language-specific layers or classifier layers for various languages. The source model's SHL serve as a feature converter, converting various language features to a common feature space [129]. However, some language-dependent features may exist in the common feature space, which is not a positive factor for cross-lingual knowledge transfer.…”

Section: Adversarial TL-based ASR (mentioning)

confidence: 99%

Video surveillance using deep transfer learning and deep domain adaptation: Towards better generalization

Himeur, Al‐Maadeed, Kheddar, et al. 2023. Engineering Applications of Artificial Intelligence.
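The shared-hidden-layer arrangement described in the citation statements above can be illustrated with a minimal sketch. It is not the cited paper's implementation: the class `MultilingualNet`, its method names, the layer sizes, and the source language names are all hypothetical, and training is omitted. It only shows how shared layers act as a common feature extractor while each language keeps its own output layer, and how a new target language reuses the shared layers as its initialization.

```python
import random


def make_layer(n_in, n_out, rng):
    """Return n_out rows of n_in small random weights (hypothetical init)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def linear_layer(weights, x):
    """Plain matrix-vector product, used for the output layer."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu_layer(weights, x):
    """Fully connected layer followed by ReLU, used for hidden layers."""
    return [max(0.0, a) for a in linear_layer(weights, x)]


class MultilingualNet:
    """Shared hidden layers plus one output layer per language (untrained sketch)."""

    def __init__(self, n_features, n_hidden, outputs_per_language, seed=0):
        self._rng = random.Random(seed)
        # Shared hidden layers: the same weights are used for every language.
        self.shared = [
            make_layer(n_features, n_hidden, self._rng),
            make_layer(n_hidden, n_hidden, self._rng),
        ]
        # Language-specific output layers.
        self.heads = {
            lang: make_layer(n_hidden, n_out, self._rng)
            for lang, n_out in outputs_per_language.items()
        }

    def features(self, x):
        """Map an input frame into the common feature space."""
        for layer in self.shared:
            x = relu_layer(layer, x)
        return x

    def logits(self, x, lang):
        """Score the output units of one language for one input frame."""
        return linear_layer(self.heads[lang], self.features(x))

    def add_language(self, lang, n_out):
        """Attach a fresh output layer; the shared layers are kept as-is,
        which is where fine-tuning on the target language would start."""
        n_hidden = len(self.shared[-1])
        self.heads[lang] = make_layer(n_hidden, n_out, self._rng)


# Source languages here are illustrative placeholders only.
net = MultilingualNet(4, 8, {"source_a": 5, "source_b": 6})
net.add_language("vietnamese", 3)
frame = [0.2, -0.1, 0.4, 0.0]
print(len(net.logits(frame, "vietnamese")))  # prints 3
```

In a real system the shared layers would first be trained on the high-resource languages, and the new output layer (and optionally the shared layers) would then be fine-tuned on the low-resource data.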

“…In 2014, Yosinski of Cornell University carried out a study on the portability of deep neural networks based on ImageNet data sets [45]-[46]. The results show that: (1) with the help of transfer learning, it is better to use an existing network than a neural network whose weights are randomly initialized and trained with a small amount of data; (2) fine-tuning the neural network parameters can achieve better results of transfer learning.…”

Section: Transfer Learning (mentioning)

confidence: 99%

End-to-End Amdo-Tibetan Speech Recognition Based on Knowledge Transfer

Zhu, Huang. 2020. IEEE Access.

End-to-end speech recognition addresses a limitation of the traditional speech recognition model, in which each component is independent and the models cannot be jointly optimized. It incorporates the acoustic model, language model, and decoding unit of the hybrid model into a single neural network, which avoids the inherent defects of multiple modules and greatly reduces the complexity of the speech recognition model. In this research, an Amdo-Tibetan speech recognition system is constructed based on the Listen, Attend and Spell (LAS) model. It converts an Amdo-Tibetan speech sequence directly into the corresponding character sequence and greatly reduces the difficulty of building an Amdo-Tibetan speech recognition model. To further improve the performance of the proposed system, the following improvements are made: first, a multi-head attention mechanism is introduced to improve the alignment accuracy between the state vectors of the decoder and encoder; second, the label smoothing technique is adopted to address the problem of over-fitting; third, an N-gram language model is combined with the LAS model to increase recognition accuracy, and the maximum mutual information (MMI) criterion is employed for discriminative training; and finally, transfer learning is utilized to overcome the problem of insufficient training data. Experimental results show that the proposed model can significantly enhance the performance of Amdo-Tibetan speech recognition.
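As an illustration of the label smoothing step mentioned in this abstract, here is a minimal sketch of the common uniform formulation. The abstract does not say which variant or smoothing weight the authors used, so the function names and the value `epsilon=0.1` are assumptions made for the example.

```python
import math


def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Uniform label smoothing: spread epsilon evenly over all classes
    and give the remaining 1 - epsilon to the target class."""
    dist = [epsilon / num_classes] * num_classes
    dist[target_index] += 1.0 - epsilon
    return dist


def cross_entropy(logits, target_dist):
    """Cross-entropy between a target distribution and softmax(logits),
    computed with the log-sum-exp trick for numerical stability."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_z) for t, v in zip(target_dist, logits))


# One decoding step over a hypothetical 4-character vocabulary.
logits = [2.0, 0.5, -1.0, 0.1]
hard = [0.0, 1.0, 0.0, 0.0]
soft = smooth_labels(target_index=1, num_classes=4)
print(cross_entropy(logits, hard), cross_entropy(logits, soft))
```

With smoothed targets the loss can no longer be driven to zero by making one logit arbitrarily large, which is the usual reason the technique helps against over-fitting.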

Weakly Correlated Knowledge Integration for Few-shot Image Classification

2022

Various few-shot image classification methods indicate that transferring knowledge from other sources can improve the accuracy of the classification. However, most of these methods work with one single source or use only closely correlated knowledge sources. In this paper, we propose a novel weakly correlated knowledge integration (WCKI) framework to address these issues. More specifically, we propose a unified knowledge graph (UKG) to integrate knowledge transferred from different sources (i.e., visual domain and textual domain). Moreover, a graph attention module is proposed to sample the subgraph from the UKG with low complexity. To avoid explicitly aligning the visual features to the potentially biased and weakly correlated knowledge space, we sample a task-specific subgraph from UKG and append it as latent variables. Our framework demonstrates significant improvements on multiple few-shot image classification datasets.
