2011
DOI: 10.1109/lsp.2011.2160850

Low Rank Language Models for Small Training Sets

Abstract: Several language model smoothing techniques are available that are effective for a variety of tasks; however, training with small data sets is still difficult. This letter introduces the low rank language model, which uses a low rank tensor representation of joint probability distributions for parameter-tying and optimizes likelihood under a rank constraint. It obtains lower perplexity than standard smoothing techniques when the training set is small and also leads to perplexity reduction when used in…
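The letter's tensor formulation is not reproduced on this page, but the general idea of fitting a language model by maximizing likelihood under a rank constraint can be sketched with the aggregate Markov model of Saul and Pereira (1997), one of the earlier low rank approaches cited in the excerpts below: the bigram distribution P(w2 | w1) is factored through a small number of latent classes and estimated with EM. The NumPy sketch below is illustrative only; the function name and parameters are assumptions made for this page, not the letter's own method.

import numpy as np

def fit_aggregate_markov(counts, rank, n_iter=50, seed=0):
    """EM for a rank-constrained bigram model:
        P(w2 | w1) = sum_z P(z | w1) * P(w2 | z)
    where counts is a (V, V) matrix of bigram counts C(w1, w2)."""
    rng = np.random.default_rng(seed)
    V = counts.shape[0]
    # Random nonnegative initialization, normalized into valid distributions.
    p_z_given_w1 = rng.random((V, rank))
    p_z_given_w1 /= p_z_given_w1.sum(axis=1, keepdims=True)
    p_w2_given_z = rng.random((rank, V))
    p_w2_given_z /= p_w2_given_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior over latent classes for every (w1, w2) pair.
        joint = p_z_given_w1[:, :, None] * p_w2_given_z[None, :, :]  # (V, rank, V)
        posterior = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-12)
        expected = counts[:, None, :] * posterior  # expected class-labelled counts
        # M-step: re-estimate both factors from the expected counts.
        p_z_given_w1 = expected.sum(axis=2)
        p_z_given_w1 /= np.maximum(p_z_given_w1.sum(axis=1, keepdims=True), 1e-12)
        p_w2_given_z = expected.sum(axis=0)
        p_w2_given_z /= np.maximum(p_w2_given_z.sum(axis=1, keepdims=True), 1e-12)

    # Rows of the returned (V, V) matrix are P(w2 | w1); its rank is at most rank.
    return p_z_given_w1 @ p_w2_given_z

For example, fit_aggregate_markov(bigram_counts, rank=8) assigns probability mass even to bigrams never observed in a small training set, because every row of the result is constrained to lie in an 8-dimensional subspace shared across histories.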

Cited by 6 publications (6 citation statements)
References 11 publications
“…Thus, we anticipate that better feature sets will lead to even larger performance gains over the baseline. Motivated by previous results [1,2], we also expect that interpolating the SLR-LM with a standard smoothed n-gram model would yield further improvements.…”
Section: Discussion
confidence: 84%
“…For example, the co-occurrence statistics for a word α that has been observed 50 times may be sufficiently similar to a set of words β that have been observed hundreds or thousands of times for α's weights to be pushed into β's subspace of weights; in effect, this "fills in" missing entries from α's weight rows and columns. The idea of learning and exploiting similarities between objects (e.g. words and histories) is a common theme in the literature on learning shared representations [5] and is used by language models with continuous representations of words [6,7].…”
Section: Low Rank Component
confidence: 99%
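The mechanism described in the excerpt above, where a rarely observed word inherits co-occurrence structure from better-observed words with similar behaviour, can be illustrated outside the cited model with a truncated SVD of a toy count matrix. The values below are invented purely for illustration and do not come from the letter or the citing paper.

import numpy as np

# Toy co-occurrence counts: two well-observed words (beta_1, beta_2) and one
# rare word (alpha) whose row is mostly zeros because it was seldom seen.
M = np.array([
    [4.0, 3.0, 0.0, 5.0],  # beta_1
    [5.0, 4.0, 1.0, 6.0],  # beta_2
    [1.0, 1.0, 0.0, 0.0],  # alpha
])

U, s, Vt = np.linalg.svd(M, full_matrices=False)
rank = 2
M_lowrank = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-2 approximation

# alpha's reconstructed row now lies in the subspace spanned by the frequent
# words' co-occurrence patterns, so its unobserved entries are generally
# replaced by small nonzero estimates borrowed from that shared structure.
print(np.round(M_lowrank, 2))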
“…There are a few low rank approaches (Saul and Pereira, 1997; Bellegarda, 2000; Hutchinson et al., 2011), but they are only effective in restricted settings (e.g. small training sets, or corpora divided into documents) and do not generally perform comparably to state-of-the-art models.…”
Section: Related Work
confidence: 99%