2003
DOI: 10.1007/978-3-540-39857-8_37
Optimizing Local Probability Models for Statistical Parsing

Abstract: This paper studies the properties and performance of models for estimating local probability distributions which are used as components of larger probabilistic systems: history-based generative parsing models. We report experimental results showing that memory-based learning outperforms many commonly used methods for this task (Witten-Bell, Jelinek-Mercer with fixed weights, decision trees, and log-linear models). However, we can connect these results with the commonly used general class of deleted i…
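The abstract compares several smoothing methods for estimating local probability distributions. As a hedged illustration only (not the paper's implementation, and with an arbitrary toy corpus and weight), Jelinek-Mercer smoothing with a fixed weight interpolates a higher-order relative-frequency estimate with a lower-order one:

```python
from collections import Counter

def jelinek_mercer(bigram_counts, unigram_counts, total, lam=0.7):
    """Return a function P(w | h) that interpolates bigram and unigram
    relative frequencies with a fixed weight lam (Jelinek-Mercer)."""
    def prob(w, h):
        # Total count of bigrams whose history is h.
        h_total = sum(c for (hh, _), c in bigram_counts.items() if hh == h)
        p_bi = bigram_counts.get((h, w), 0) / h_total if h_total else 0.0
        p_uni = unigram_counts.get(w, 0) / total
        return lam * p_bi + (1 - lam) * p_uni
    return prob

# Toy corpus (hypothetical): count unigrams and bigrams.
tokens = "the cat sat on the mat".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
p = jelinek_mercer(bigrams, unigrams, len(tokens))
# P("cat" | "the") mixes the bigram estimate 1/2 with the unigram
# estimate 1/6: 0.7 * 0.5 + 0.3 * (1/6) = 0.40
```

Witten-Bell smoothing differs in that the interpolation weight is not fixed but derived from the number of distinct continuations observed after each history.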

Cited by 4 publications (3 citation statements)
References 16 publications
“…in 1986 [16], has increasingly been away from the sole use of MLE and toward an alternative approach very similar to that prescribed by Equation 2 known as global discriminative training [17-19] or conditional maximum likelihood [20]. The problem also appears in a slightly different form in the related field of statistical natural language parsing, in which it has been suggested that global methods for optimizing competing stochastic grammar models may improve the accuracy of systems at the level of whole-sentence parses [21]. Maximum discrimination HMMs have already been applied successfully to problems in the realm of biological sequence analysis [22], though their use in gene finding has apparently not yet seen widespread adoption.…”
Section: Introduction
confidence: 99%
“…In testing, we only consider ambiguous sentences, while unambiguous ones may be used in training. A previous version of the corpus, the 1st Growth, was used in the experiments reported in the papers Toutanova et al., 2003b. The 3rd Growth of Redwoods is much more ambiguous than the previous version because of grammar changes and inclusion of highly ambiguous sentences that were initially excluded.…”
Section: Results
confidence: 99%
“…To populate a knowledge base, Riedel et al (2013) jointly learned latent feature vectors of entities, relational patterns, and relation types in the knowledge base. Toutanova et al (2015) adapted CNN to capture the compositional structure of a relational pattern during the joint learning. For open domain question answering, Yih et al (2014) proposed the method to map an interrogative sentence on an entity and a relation type contained in a knowledge base by using CNN.…”
Section: Related Work
confidence: 99%