Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02), 2002
DOI: 10.3115/1073083.1073105

Active learning for statistical natural language parsing

Abstract: It is necessary to have a (large) annotated corpus to build a statistical parser. Acquisition of such a corpus is costly and time-consuming. This paper presents a method to reduce this demand using active learning, which selects which samples to annotate instead of blindly annotating the whole training corpus. Sample selection for annotation is based upon "representativeness" and "usefulness". A model-based distance is proposed to measure the difference between two sentences and their most likely parse trees. Based o…
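The selection idea in the abstract can be sketched, purely for illustration, as a pool-based active learning step that ranks unlabeled sentences by the parser's uncertainty. This is a minimal sketch under stated assumptions: the entropy scoring, the `parse_probs` mapping, and all names below are hypothetical stand-ins, not the paper's actual model-based distance or sample-selection criteria.

```python
import math

def entropy(probs):
    """Shannon entropy of a parser's distribution over candidate parses."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(pool, parse_probs, budget):
    """Pick the `budget` most uncertain sentences from the unlabeled pool.

    `parse_probs` maps each sentence to the probabilities the current
    parser assigns to its candidate parse trees (hypothetical interface).
    """
    ranked = sorted(pool, key=lambda s: entropy(parse_probs[s]), reverse=True)
    return ranked[:budget]

# Toy pool: three "sentences" with parser confidence distributions.
pool = ["s1", "s2", "s3"]
parse_probs = {
    "s1": [0.9, 0.1],          # parser is confident
    "s2": [0.5, 0.5],          # maximally uncertain between two parses
    "s3": [0.85, 0.1, 0.05],   # fairly confident among three parses
}
print(select_for_annotation(pool, parse_probs, budget=1))  # ['s2']
```

Only the selected sentences would then be sent to human annotators and added to the training set, after which the parser is retrained and the loop repeats.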

Cited by 105 publications (82 citation statements)
References 14 publications
“…Such has happened in the case of part of speech tagging, where the query by committee methods are generalized to apply to hidden Markov models (Dagan and Engelson 1995). In parsing, uncertainty sampling (Hwa 2004) and other heuristic approaches have been applied (Tang et al 2002). A recent trend in the pool-based active learning literature has been to take various approaches, usually uncertainty sampling or query by committee and try to improve performance through additional heuristics.…”
Section: Heuristic Generalizations and Variations
confidence: 99%
“…A trend of the last ten years (Abe and Mamitsuka 1998; Banko and Brill 2001; Chen et al 2006; Dagan and Engelson 1995; Hwa 2004; Lewis and Gale 1994; McCallum and Nigam 1998; Melville and Mooney 2004; Roy and McCallum 2001; Tang et al 2002) has been to employ heuristic methods of active learning with no explicitly defined objective function. Uncertainty sampling (Lewis and Gale 1994), query by committee (Seung et al 1992), and variants have proven particularly attractive because of their portability across a wide spectrum of machine learning algorithms.…”
Section: Background and Related Work
confidence: 99%
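The query-by-committee heuristic mentioned in the passage above can be illustrated with a toy disagreement measure: vote entropy over the parses proposed by several committee members. This is a sketch only; the committee, the tree labels, and the data layout are all hypothetical.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Committee disagreement: entropy of the distribution of votes."""
    n = len(votes)
    counts = Counter(votes)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Each committee member proposes a parse tree for each sentence;
# the sentence with the highest disagreement is queried for annotation.
committee_votes = {
    "s1": ["treeA", "treeA", "treeA"],   # full agreement
    "s2": ["treeA", "treeB", "treeC"],   # maximal disagreement
}
most_informative = max(committee_votes, key=lambda s: vote_entropy(committee_votes[s]))
print(most_informative)  # s2
```

Uncertainty sampling uses a single model's confidence instead, but both reduce to scoring unlabeled examples and querying the highest-scoring ones.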
“…One actively researched approach to this problem is to develop weakly supervised algorithms that require less training data, such as active learning (Hermjakob and Mooney 1997; Tang et al 2002; Baldridge and Osborne 2003; Hwa 2004) and co-training (Sarkar 2001; Steedman et al 2003). In this article, we explore an alternative: using parallel text as a means for transferring syntactic knowledge from a resource-rich language to a language with fewer resources.…”
Section: Introduction
confidence: 99%
“…AL has been successfully applied to many tasks in natural language processing, including parsing (Tang et al, 2002) and named entity recognition (Miller et al, 2004), to mention just a few. See (Olsson, 2009) for a comprehensive overview of the application of AL to natural language processing.…”
Section: Introduction and Related Research
confidence: 99%