Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL '09), 2009
DOI: 10.3115/1609067.1609158

Learning efficient parsing

Abstract: A corpus-based technique is described to improve the efficiency of wide-coverage, high-accuracy parsers. By keeping track of the derivation steps that lead to the best parse for a very large collection of sentences, the parser learns which parse steps can be filtered without significant loss in parsing accuracy, but with a considerable increase in parsing efficiency. An interesting characteristic of our approach is that it is self-learning, in the sense that it uses unannotated corpora.
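As a rough illustration of the idea in the abstract, the sketch below counts how often each derivation step occurs in the best parses of a large unannotated corpus and then keeps only steps seen often enough. This is a minimal sketch under assumed interfaces: the parser methods (parse_best, derivation_steps) and the frequency threshold are hypothetical, not the paper's actual API or settings.

```python
# Minimal sketch of the self-learning filtering idea, assuming a hypothetical
# parser object with parse_best() and derivation_steps(); the threshold is
# illustrative, not taken from the paper.
from collections import Counter

def collect_step_counts(parser, sentences):
    """Parse a large unannotated corpus and count how often each
    derivation step occurs in the best parse chosen by the parser."""
    counts = Counter()
    for sentence in sentences:
        best_parse = parser.parse_best(sentence)    # best parse per the parser's heuristics
        for step in best_parse.derivation_steps():  # steps used in that derivation
            counts[step] += 1
    return counts

def build_step_filter(counts, min_count=5):
    """Allow only derivation steps seen at least min_count times in best
    parses; the parser can skip the rest, trading a small amount of
    accuracy for a large gain in efficiency."""
    allowed = {step for step, count in counts.items() if count >= min_count}
    return lambda step: step in allowed
```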

Cited by 3 publications (2 citation statements) | References: 6 publications
“…With these options, the parser delivers a single parse, which it believes is the best parse according to a variety of heuristics. These include the disambiguation model and various optimizations of the parser presented in [9,16,17]. Furthermore, a timeout is enforced in order that the parser cannot spend more than 190 s on a single sentence.…”
Section: Estimation of the Quality of Lassy Large
confidence: 99%
“…Its grammar is designed following ideas of Head-driven Phrase Structure Grammar (Pollard and Sag, 1994), it uses a maximum-entropy model for statistical disambiguation, and coverage has been increased over the years by means of semi-automatic extension of the lexicon based on error-mining (van Noord, 2004). Efficiency is improved by using a part-of-speech tagger to filter out unlikely POS tags before parsing (Prins and van Noord, 2001), and by means of a technique which filters unlikely derivations based on statistics collected from automatically parsed corpora (van Noord, 2009). Alpino is a crucial component of Joost, an open-domain question-answering system for Dutch.…”
Section: Dependency Information for Question Answering and Relation Extraction
confidence: 99%
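The statement above also mentions filtering unlikely POS tags with a tagger before parsing (attributed to Prins and van Noord, 2001). The sketch below illustrates that general idea only; the tagger's tag_distribution method, the lexicon's entries method, and the probability threshold are assumed names for illustration, not the interfaces of Alpino or of the cited work.

```python
# Minimal sketch of POS-tag filtering before parsing; the tagger and lexicon
# interfaces and the probability threshold are assumptions for illustration.
def filter_lexical_entries(tagger, lexicon, sentence, threshold=0.01):
    """For each token, keep only lexical entries whose POS tag the tagger
    considers sufficiently likely, shrinking the parser's search space."""
    filtered = []
    for token in sentence.split():
        tag_probs = tagger.tag_distribution(token, sentence)   # {tag: probability}
        likely_tags = {tag for tag, p in tag_probs.items() if p >= threshold}
        entries = [e for e in lexicon.entries(token) if e.pos in likely_tags]
        filtered.append(entries if entries else lexicon.entries(token))  # fall back if all filtered
    return filtered
```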