The LinGO Redwoods treebank motivation and preliminary applications

Oepen, Stephan; Toutanova, Kristina; Shieber, Stuart M.; Manning, Christopher D.; Flickinger, Dan; Brants, Thorsten

doi:10.3115/1071884.1071909

Cited by 59 publications

(72 citation statements)

References 14 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…We investigate these ideas via experiments in probabilistic parse selection from among a set of alternatives licensed by a hand-built grammar in the context of the newly developed Redwoods HPSG treebank [14]. HPSG (Head-driven Phrase Structure Grammar) is a modern constraint-based lexicalist (unification) grammar, described in [15].…”

Section: Methodsmentioning

confidence: 99%

Optimizing Local Probability Models for Statistical Parsing

Toutanova

Mitchell

Manning

2003

Machine Learning: ECML 2003

Self Cite

View full text Add to dashboard Cite

Abstract. This paper studies the properties and performance of models for estimating local probability distributions which are used as components of larger probabilistic systems -history-based generative parsing models. We report experimental results showing that memory-based learning outperforms many commonly used methods for this task (Witten-Bell, Jelinek-Mercer with fixed weights, decision trees, and log-linear models). However, we can connect these results with the commonly used general class of deleted interpolation models by showing that certain types of memory-based learning, including the kind that performed so well in our experiments, are instances of this class. In addition, we illustrate the divergences between joint and conditional data likelihood and accuracy performance achieved by such models, suggesting that smoothing based on optimizing accuracy directly might greatly improve performance.

show abstract

Section: Methodsmentioning

confidence: 99%

Optimizing Local Probability Models for Statistical Parsing

Toutanova

Mitchell

Manning

2003

Machine Learning: ECML 2003

Self Cite

View full text Add to dashboard Cite

show abstract

“…B&L use head-driven generative parsing strategies from sentential parsing (e. g., Collins 2003) to build sdrss for the Verbmobil appointment scheduling and travel planning dialogs that make up a large part of the Redwoods Treebank (Oepen et al 2002). An example dialog is that given in Figure 5.…”

Section: Discourse Parsing For Dialogmentioning

confidence: 99%

Annotation for and Robust Parsing of Discourse Structure on Unrestricted Texts

Baldridge¹,

Asher²,

Hunter³

2007

Zeitschrift Für Sprachwissenschaft

View full text Add to dashboard Cite

“…3 The ERG supports both parsing and generation, via the semantic formalism of Minimal Recursion Semantics ("MRS": Copestake et al (2005)). To generate paraphrases with the ERG, we simply parse a given input, select the preferred parse using a pretrained parse selection model (Oepen et al, 2002), and exhaustively generate from the resultant MRS. We then use uniform random sampling to select from the generator outputs, which potentially numbers in the thousands of variants. To handle unknown words during parsing and generation, we use POS mapping and introduce a unique relation for each unknown word, which we use to substitute the unknown word back in to the generation output.…”

Section: Generating Text Noisementioning

confidence: 99%

Robust Training under Linguistic Adversity

Li¹,

Cohn²,

Baldwin³

2017

Proceedings of the 15th Conference of the European Chapter of The Association for Computational Linguistics: Volume 2

View full text Add to dashboard Cite

Deep neural networks have achieved remarkable results across many language processing tasks, however they have been shown to be susceptible to overfitting and highly sensitive to noise, including adversarial attacks. In this work, we propose a linguistically-motivated approach for training robust models based on exposing the model to corrupted text examples at training time. We consider several flavours of linguistically plausible corruption, include lexical semantic and syntactic methods. Empirically, we evaluate our method with a convolutional neural model across a range of sentiment analysis datasets. Compared with a baseline and the dropout method, our method achieves better overall performance.

show abstract

The LinGO Redwoods treebank motivation and preliminary applications

Cited by 59 publications

References 14 publications

Optimizing Local Probability Models for Statistical Parsing

Optimizing Local Probability Models for Statistical Parsing

Annotation for and Robust Parsing of Discourse Structure on Unrestricted Texts

Robust Training under Linguistic Adversity

Contact Info

Product

Resources

About