Interspeech 2018
DOI: 10.21437/interspeech.2018-2476
Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition

Abstract: Recurrent neural networks (NN) with long short-term memory (LSTM) are the current state of the art for modeling long-term dependencies. However, recent studies indicate that NN language models (LM) need only a limited length of history to achieve excellent performance. In this paper, we extend the previous investigation of LSTM-network-based n-gram modeling to the domain of automatic speech recognition (ASR). First, applying recent optimization techniques and up to 6-layer LSTM networks, we improve LM perplexities b…
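The defining property of an LSTM n-gram LM is that the recurrent state is not carried across the whole sentence: each prediction sees only a fixed window of the most recent words, as in a conventional n-gram model. A minimal sketch of that idea follows; the PyTorch framing, layer sizes, n-gram order, and all names are illustrative assumptions, not the configuration used in the paper (which reports up to 6 LSTM layers).

```python
import torch
import torch.nn as nn

class LSTMNgramLM(nn.Module):
    """Illustrative LSTM-based n-gram LM: for every prediction the LSTM
    reads only the last (order - 1) words, so the usable context is
    truncated like an n-gram model (all sizes are placeholders)."""

    def __init__(self, vocab_size, order=5, embed_dim=256,
                 hidden_dim=512, num_layers=2):
        super().__init__()
        self.order = order                        # n of the n-gram
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_ids):
        # word_ids: (batch, seq_len) running text; each next word is
        # predicted from at most the preceding (order - 1) words.
        batch, seq_len = word_ids.shape
        logits = []
        for t in range(seq_len):
            start = max(0, t - self.order + 2)    # truncated history window
            window = word_ids[:, start:t + 1]
            h, _ = self.lstm(self.embed(window))  # fresh LSTM state per window
            logits.append(self.out(h[:, -1]))     # last position predicts the next word
        return torch.stack(logits, dim=1)         # (batch, seq_len, vocab)
```

Re-running the LSTM per window is wasteful but makes the restriction explicit: unlike an unrestricted recurrent LM, no information from outside the n-gram window can reach the prediction.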

Cited by 18 publications (15 citation statements). References 26 publications.
“…Besides the related work already cited, another paper, written by Tüske et al. [20], is also closely related to our work. In this comprehensive study, an RNNLM, RNN n-gram models and BNLMs are compared on various English and German ASR tasks.…”
Section: Introduction (supporting)
confidence: 66%
“…RNN n-gram models were found to be superior to BNLMs both in terms of word perplexity and WER, whereas high order RNN n-grams were close to the performance of an unrestricted RNNLM. However, in [20] ASR results were obtained with two-pass decoding, and German (and obviously English) morphology is less complex than Hungarian.…”
Section: Introduction (mentioning)
confidence: 99%
“…Fluency. This is usually evaluated with a language model in many NLP applications (Peris and Casacuberta, 2015; Tüske et al., 2018). We used a two-layer recurrent neural network with gated recurrent units as a language model, and trained it on the target style part of the corpus.…”
Section: Automatic Evaluation (mentioning)
confidence: 99%
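The fluency score described in that citation is typically the perplexity a trained language model assigns to a candidate sentence. A minimal, generic sketch follows (a two-layer GRU LM and a perplexity helper); all sizes, names, and the scoring recipe are assumptions for illustration, not the cited authors' exact setup.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRULM(nn.Module):
    """Two-layer GRU language model (sizes are illustrative placeholders)."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_ids):
        h, _ = self.gru(self.embed(word_ids))
        return self.out(h)                        # (batch, seq, vocab) logits

def sentence_perplexity(model, word_ids):
    """Fluency proxy: perplexity of a sentence under the trained LM
    (lower perplexity = more fluent according to the model)."""
    with torch.no_grad():
        logits = model(word_ids[:, :-1])          # predict each following word
        nll = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                              word_ids[:, 1:].reshape(-1))
    return math.exp(nll.item())
```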
“…As Deep Neural Networks (DNNs) have become dominant in more and more areas of speech technology, such as speech recognition [9,13,26], speech synthesis [2,21] and language modeling [23,24,29], it is natural that recent studies have attempted to solve the ultrasound-to-speech conversion problem by employing deep learning, regardless of whether sEMG [17], ultrasound video [18] or PMA [10] is used as an input. Our team used DNNs to predict the spectral parameter values [4] and F0 [12] of a vocoder using UTI as articulatory input; in a later study we extended our method to include multi-task training [28].…”
Section: Introduction (mentioning)
confidence: 99%