2011
DOI: 10.1007/s10579-011-9142-3

An annotated corpus for the analysis of VP ellipsis

Abstract: Verb Phrase Ellipsis (VPE) has been studied in great depth in theoretical linguistics, but empirical studies of VPE are rare. We extend the few previous corpus studies with an annotated corpus of VPE in all 25 sections of the Wall Street Journal corpus (WSJ) distributed with the Penn Treebank. We annotated the raw files using a stand-off annotation scheme that codes the auxiliary verb triggering the elided verb phrase, the start and end of the antecedent, the syntactic type of antecedent (VP, TV, NP, PP or AP)…

Cited by 33 publications (61 citation statements)
References: 20 publications
“…The primary results we present in this section are obtained through 5-fold cross validation over all 25 sections of the automatically-parsed dataset. We use cross validation because the train-test split suggested by Bos and Spenader (2011) could result in highly varied results due to the small size of the dataset (see Table 1). Because the vast majority of auxiliaries do not trigger VPE, we over-sample the positive cases during training.…”
Section: Methods (mentioning)
confidence: 99%
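
The evaluation setup described in the statement above (5-fold cross validation with over-sampling of the rare positive cases during training) can be illustrated with a minimal sketch. It is not the citing authors' code: the function names, the fixed seed, and the choice to duplicate positives until the two classes are the same size are all assumptions made for illustration. Over-sampling is applied to the training portion of each fold only, so the held-out fold keeps its original class balance.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def oversample_positives(train_idx, labels, rng):
    """Duplicate positive examples until they match the negatives in number.

    Balancing to 1:1 is an assumption; the quoted text only says that
    positive cases are over-sampled.
    """
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    if not pos or len(pos) >= len(neg):
        return list(train_idx)
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return list(train_idx) + extra

def cross_validation_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(labels), k, seed)
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds)
                     if f != held_out for i in fold]
        # Over-sample inside the training portion only.
        yield oversample_positives(train_idx, labels, rng), test_idx

if __name__ == "__main__":
    # Toy labels: 1 = auxiliary triggers VPE, 0 = it does not (hypothetical data).
    toy_labels = [1] * 10 + [0] * 90
    for train, test in cross_validation_splits(toy_labels):
        n_pos = sum(toy_labels[i] for i in train)
        print(len(train), n_pos, len(test))
```

Keeping the duplicated positives out of the held-out fold matters: if over-sampling were done before splitting, copies of the same example could land in both training and test data and inflate the scores.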
“…We divide auxiliaries into the six different categories shown in Table 1, which will be relevant for our feature extraction and model training process, as we will describe. This division is motivated by the fact that different auxiliaries exhibit different behaviours (Bos and Spenader, 2011). The results we present on the different auxiliary categories (see Tables 2 and 4) are obtained from training a single classifier over the entire dataset and then testing on auxiliaries from each category, with the ALL result being the accuracy obtained over all of the test data.…”
Section: Approach and Data (mentioning)
confidence: 99%
“…As far as we know, there are precisely seven systematic corpus annotations of ellipsis, four focusing on verb phrase ellipsis (essentially, VPE and a handful of similar verbal processes, like pseudogapping and comparative deletion) (Hardt, 1997; Nielsen, 2005; Bos and Spenader, 2011; Shahabi and Baptista, 2012) and three on sluicing (Fernández et al., 2005; Beecher, 2008; Nykiel, 2010).…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition to coding VPE antecedents, he provides text corresponding to an intuitive paraphrase of the ellipsis site and classifies the kind of mismatch between the antecedent and paraphrase according to thirteen criteria (e.g., tense mismatch, comparatives, inversion, split antecedents, inferred antecedent). In a similar effort, Bos & Spenader (2011) examined the entire WSJ portion of the Penn Treebank, focusing on modals and auxiliaries that "trigger" VPE. They find 580 instances of VPE and related phenomena, which they code for antecedent as well as: the morphosyntactic category of the antecedent, the trigger, and 34 strings connecting the antecedent and elision site.…”
Section: Related Work (mentioning)
confidence: 99%