2004
DOI: 10.1162/0891201041850894

Sample Selection for Statistical Parsing

Abstract: Corpus-based statistical parsing relies on using large quantities of annotated text as training examples. Building this kind of resource is expensive and labor-intensive. This work proposes to use sample selection to find helpful training examples and reduce human effort spent on annotating less informative ones. We consider several criteria for predicting whether unlabeled data might be a helpful training example. Experiments are performed across two syntactic learning tasks and within the single task of pars…
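The abstract describes ranking unlabeled sentences by how informative their annotation is likely to be. Below is a minimal sketch of one such criterion, entropy of the parser's distribution over candidate parses, used to pick sentences for annotation. The `parse_distribution` hook, the pool, and the cutoff `k` are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
import math
from typing import Callable, Iterable

def tree_entropy(parse_probs: Iterable[float]) -> float:
    """Entropy of a (possibly truncated) distribution over candidate parses.

    Higher entropy means the parser is less certain about the sentence,
    which sample selection treats as a signal that annotation may help.
    """
    probs = [p for p in parse_probs if p > 0.0]
    total = sum(probs)
    probs = [p / total for p in probs]      # renormalize the n-best list
    return -sum(p * math.log(p) for p in probs)

def select_for_annotation(pool: list[str],
                          parse_distribution: Callable[[str], list[float]],
                          k: int = 100) -> list[str]:
    """Return the k unlabeled sentences with the highest tree entropy.

    `parse_distribution` is a hypothetical hook returning the parser's
    probabilities for the n-best parses of a sentence.
    """
    scored = [(tree_entropy(parse_distribution(s)), s) for s in pool]
    scored.sort(reverse=True)               # most uncertain first
    return [s for _, s in scored[:k]]
```

The paper itself compares several selection criteria (and practical systems often normalize scores by sentence length); the sketch above only illustrates the basic entropy-ranking idea.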

Cited by 96 publications (86 citation statements). References 24 publications.
“…We use this algorithm to compute the true sequence entropy (3) for active learning in a constant-time factor of Viterbi's complexity. Hwa (2004) employed a similar approach for active learning with probabilistic context-free grammars.…”
Section: Uncertainty Sampling
mentioning
confidence: 99%
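The quoted statement refers to computing the entropy of the full distribution over label sequences, not just scoring the single Viterbi path. The brute-force version below for a toy first-order model is included only to pin down the quantity being computed; the cited work obtains the same value with a Viterbi-like dynamic program, which is not reproduced here, and the model parameters shown are hypothetical example data.

```python
import itertools
import math

def sequence_entropy_bruteforce(init, trans, emit, obs, labels):
    """H(Y | x) = -sum_y P(y | x) log P(y | x) over all label sequences.

    init[l], trans[l1][l2], and emit[l][o] are probabilities of a toy
    first-order model. Enumeration is exponential in len(obs); the cited
    approach computes the same entropy with a forward-style dynamic program.
    """
    joint = {}
    for y in itertools.product(labels, repeat=len(obs)):
        p = init[y[0]] * emit[y[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[y[t - 1]][y[t]] * emit[y[t]][obs[t]]
        joint[y] = p
    z = sum(joint.values())                  # marginal P(x) over all sequences
    return -sum((p / z) * math.log(p / z) for p in joint.values() if p > 0.0)

# Tiny hypothetical example: two labels, two observations.
labels = ("N", "V")
init = {"N": 0.6, "V": 0.4}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"fish": 0.5, "can": 0.5}, "V": {"fish": 0.4, "can": 0.6}}
print(sequence_entropy_bruteforce(init, trans, emit, ["fish", "can"], labels))
```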
“…A trend of the last ten years (Abe and Mamitsuka 1998; Banko and Brill 2001; Chen et al 2006; Dagan and Engelson 1995; Hwa 2004; Lewis and Gale 1994; McCallum and Nigam 1998; Melville and Mooney 2004; Roy and McCallum 2001; Tang et al 2002) has been to employ heuristic methods of active learning with no explicitly defined objective function. Uncertainty sampling (Lewis and Gale 1994), query by committee (Seung et al 1992), and variants have proven particularly attractive because of their portability across a wide spectrum of machine learning algorithms.…”
Section: Background and Related Work
mentioning
confidence: 99%
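This statement contrasts uncertainty sampling with query by committee (Seung et al. 1992). A minimal sketch of the committee-disagreement idea, ranking a pool of unlabeled items by the vote entropy of a committee of models, follows; the committee members and their `predict` interface are hypothetical stand-ins for whatever learners a given system uses, not an implementation from any of the cited papers.

```python
import math
from collections import Counter

def vote_entropy(predictions):
    """Committee disagreement: entropy of the distribution of its votes."""
    counts = Counter(predictions)
    n = len(predictions)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def query_by_committee(pool, committee, k=10):
    """Pick the k pool items the committee disagrees on most.

    `committee` is any list of models exposing a `predict(x)` method
    (a hypothetical interface for this sketch).
    """
    scored = [(vote_entropy([m.predict(x) for m in committee]), i)
              for i, x in enumerate(pool)]
    scored.sort(reverse=True)                # highest disagreement first
    return [pool[i] for _, i in scored[:k]]
```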
“…Active Learning (AL) (Settles, 2009; Settles and Craven, 2008; Hwa, 2004; Tong and Koller, 2001) builds another important set of related work. Our method is inspired by uncertainty sampling.…”
Section: Related Work
mentioning
confidence: 99%