Comparison of part-of-speech and automatically derived category-based language models for speech recognition

Niesler, Thomas; Whittaker, Edward W. D.; Woodland, Philip C.

doi:10.1109/icassp.1998.674396

Cited by 43 publications

(27 citation statements)

References 10 publications


“…One approach to balance the specific, but poorer coverage word-based N-gram LMs with a more generic LM is to linearly interpolate the LM probabilities. This is commonly used with class-based LMs [17] and is used in this paper with paraphrastic LMs. Let P(w|h) denote the interpolated LM probability for any in-vocabulary word w following an arbitrary history h, this is given by…”

Section: Paraphrastic Language Models (mentioning)

confidence: 99%

Paraphrastic language models

Liu

Gales

Woodland

2014

Computer Speech & Language

Self Cite


In natural languages multiple word sequences can represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage, for example, when using n-gram language models (LM). To handle this issue, this paper presents a novel form of language model, the paraphrastic LM. A phrase level transduction model that is statistically learned from standard text data is used to generate paraphrase variants. LM probabilities are then estimated by maximizing their marginal probability. Significant error rate reductions of 0.5%-0.6% absolute were obtained on a state-of-the-art conversational telephone speech recognition task using a paraphrastic multi-level LM modelling both word and phrase sequences.
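The linear interpolation described in the citation statement above can be illustrated with a minimal sketch. The formula itself is elided in the quoted statement, so the code shows only the generic textbook form, a weighted sum of two LM probabilities for the same word and history. The function name `interpolate`, the weight `lam`, and the example probabilities are illustrative assumptions, not values taken from either paper.

```python
def interpolate(p_word: float, p_generic: float, lam: float) -> float:
    """Linearly interpolate two LM probabilities for the same word and history.

    p_word    -- probability from the specific word-based n-gram LM
    p_generic -- probability from the more generic LM (e.g. class-based)
    lam       -- interpolation weight given to the word-based LM, in [0, 1]
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_word + (1.0 - lam) * p_generic


# Hypothetical probabilities, chosen only to show the arithmetic.
p = interpolate(p_word=0.02, p_generic=0.05, lam=0.7)
print(f"{p:.3f}")
```

Because the two weights sum to one, the interpolated values still form a valid distribution over the vocabulary whenever both component models do. In practice the weight is usually tuned on held-out data rather than fixed by hand.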


“…Data-driven approaches cluster words to minimize the overall perplexity of the corpus by a greedy approach [21,75]. It has been shown that data-driven approaches outperform classes based on POS tags [102].…”

Section: Class-based LMs (mentioning)

confidence: 99%

Semantic language models with deep neural networks

Bayer

Riccardi

2016

Computer Speech & Language


Spoken language systems (SLS) communicate with users in natural language through speech. There are two main problems related to processing the spoken input in SLS. The first one is automatic speech recognition (ASR) which recognizes what the user says. The second one is spoken language understanding (SLU) which understands what the user means. We focus on the language model (LM) component of SLS. LMs constrain the search space that is used in the search for the best hypothesis. Therefore, they play a crucial role in the performance of SLS. It has long been discussed that an improvement in the recognition performance does not necessarily yield a better understanding performance. Therefore, optimization of LMs for the understanding performance is crucial. In addition, long-range dependencies in languages are hard to handle with statistical language models. These two problems are addressed in this thesis. We investigate two different LM structures. The first LM that we investigate enables SLS to understand better what they recognize by searching the ASR hypotheses for the best understanding performance. We refer to these models as joint LMs. They use lexical and semantic units jointly in the LM. The second LM structure uses the semantic context of an utterance, which can also be described as "what the system understands", to search for a better hypothesis that improves the recognition and the understanding performance. We refer to these models as semantic LMs (SELMs). SELMs use features that are based on a well-established theory of lexical semantics, namely the theory of frame semantics. They incorporate the semantic features which are extracted from the ASR hypothesis into the LM and handle long-range dependencies by using the semantic relationships between words and semantic context. ASR noise is propagated to the semantic features; to suppress this noise, we introduce the use of deep semantic encodings for semantic feature extraction. In this way, SELMs optimize both the recognition and the understanding performance.


“…Experiments presented by Niesler et al (1998) show that a class-based approach utilizing POS tagging is not as efficient as application of categories based on stochastic properties of n-grams occurring in the corpus. In the case of the problem being considered here, we cannot apply the latter approach to OOC words because they are not present in the corpus.…”

Section: Combining the LM With A Flat Word List (mentioning)

confidence: 99%

“…As another option, the simpler tool Morfeusz described by Woliński (2006) can be applied to find grammatical categories of isolated words. The class-based model LM_CB is then created using the standard method described by Niesler et al (1998). The standard word n-gram model LM_W is also created using the same corpus.…”

Section: Combining the LM With A Flat Word List (mentioning)

confidence: 99%

Pipelined language model construction for Polish speech recognition

Sas

Żołnierek

2013

International Journal of Applied Mathematics and Computer Science


The aim of works described in this article is to elaborate and experimentally evaluate a consistent method of Language Model (LM) construction for the sake of Polish speech recognition. In the proposed method we tried to take into account the features and specific problems experienced in practical applications of speech recognition in the Polish language: rich inflection, a loose word order and the tendency for short word deletion. The LM is created in five stages. Each successive stage takes the model prepared at the previous stage and modifies or extends it so as to improve its properties. At the first stage, typical methods of LM smoothing are used to create the initial model. Four most frequently used methods of LM construction are here. At the second stage the model is extended in order to take into account words indirectly co-occurring in the corpus. At the next stage, LM modifications are aimed at reduction of short word deletion errors, which occur frequently in Polish speech recognition. The fourth stage extends the model by insertion of words that were not observed in the corpus. Finally the model is modified so as to assure highly accurate recognition of very important utterances. The performance of the methods applied is tested in four language domains.
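As a rough illustration of the class-based models that the citation statements above refer to, the sketch below estimates a textbook class bigram, P(w | w_prev) ≈ P(c(w) | c(w_prev)) · P(w | c(w)), from raw counts. This is an assumption-laden simplification and not necessarily the exact model of Niesler et al. (1998) or of the citing papers: the function name `train_class_bigram`, the toy corpus, and the hand-written word-to-class map are all hypothetical, and no smoothing is applied.

```python
from collections import Counter


def train_class_bigram(tokens, word2class):
    """Return prob(word, prev_word) for an unsmoothed class-based bigram LM."""
    classes = [word2class[w] for w in tokens]
    word_count = Counter(tokens)                        # N(w)
    class_count = Counter(classes)                      # N(c)
    class_bigram = Counter(zip(classes, classes[1:]))   # N(c_prev, c)
    history_count = Counter(classes[:-1])               # N(c_prev) as a history

    def prob(word, prev_word):
        c, c_prev = word2class[word], word2class[prev_word]
        # P(c | c_prev): class transition probability
        p_class = class_bigram[(c_prev, c)] / history_count[c_prev]
        # P(w | c): probability of the word within its class
        p_member = word_count[word] / class_count[c]
        return p_class * p_member

    return prob


# Toy corpus and a hand-assigned (hypothetical) part-of-speech style mapping.
tokens = "the cat sat the dog sat".split()
word2class = {"the": "DET", "cat": "N", "dog": "N", "sat": "V"}
prob = train_class_bigram(tokens, word2class)
print(prob("dog", "the"))
```

The same code works whether the mapping comes from POS tags or from automatically derived clusters, since only `word2class` changes; that is the contrast the cited paper's title refers to. A class that never occurs as a history would cause a division by zero here, which a real model would handle with smoothing.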
