2021
DOI: 10.1162/tacl_a_00364

Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories

Abstract: Although current CCG supertaggers achieve high accuracy on the standard WSJ test set, few systems make use of the categories’ internal structure that will drive the syntactic derivation during parsing. The tagset is traditionally truncated, discarding the many rare and complex category types in the long tail. However, supertags are themselves trees. Rather than give up on rare tags, we investigate constructive models that account for their internal structure, including novel methods for tree-structured predict…

Cited by 11 publications (14 citation statements)
References 30 publications
“…A key aspect of our parser is that it makes use of a structured decomposition of lexical categories in categorial grammars. In this sense, our work follows up on the intuition of recent "constructive" supertaggers, which have been explored for a type-logical grammar (Kogkalidis et al., 2019) and for CCG (Bhargava and Penn, 2020; Prange et al., 2021). Such supertaggers construct categories out of the atomic categories of the grammar; this challenges the classical approach to supertagging, where lexical categories are treated as opaque, rendering the task of supertagging equivalent to large-tagset POS tagging.…”
Section: Related Work
confidence: 99%
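The contrast drawn above can be made concrete with a small sketch of CCG categories as binary trees over atomic categories; the class and variable names here are illustrative, not the API of any cited system:

```python
from dataclasses import dataclass
from typing import Optional

# A CCG category is either atomic (S, NP, N, PP, ...) or a functor
# built from a result and an argument subcategory joined by a slash.
# A constructive tagger emits this tree node by node, rather than
# picking one opaque label from a fixed tagset.

@dataclass(frozen=True)
class Category:
    atom: Optional[str] = None           # e.g. "NP" for an atomic category
    slash: Optional[str] = None          # "/" or "\\" for a functor
    result: Optional["Category"] = None  # the result subcategory
    arg: Optional["Category"] = None     # the argument subcategory

    def __str__(self) -> str:
        if self.atom is not None:
            return self.atom
        return f"({self.result}{self.slash}{self.arg})"

# The transitive-verb category (S\NP)/NP is a three-leaf tree
# over the atoms S and NP:
NP = Category(atom="NP")
S = Category(atom="S")
tv = Category(slash="/", result=Category(slash="\\", result=S, arg=NP), arg=NP)
print(tv)  # ((S\NP)/NP)
```

Because the tagger predicts the tree's internal nodes and leaves rather than the string as a whole, rare and complex categories share parameters with frequent ones through their common atoms.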
“…By convention, the model is limited to predicting only tags that appeared at least 10 times in the training data, yielding 425 tags plus the UNK tag. We use the non-constructive BERT-based (Devlin et al., 2019) model from Prange et al. (2021) with its default hyperparameters. The tagger was trained on 927,497 tokens and obtained a dev accuracy of 96.1%.…”
Section: CCG Supertagging
confidence: 99%
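The conventional tagset truncation described above can be sketched as a simple frequency cutoff; the corpus, threshold, and function names below are toy stand-ins, not the cited system's code:

```python
from collections import Counter

def build_tagset(train_tags, min_count=10, unk="UNK"):
    """Keep only supertags seen at least `min_count` times in training;
    everything rarer is collapsed into a single UNK tag."""
    counts = Counter(train_tags)
    kept = {tag for tag, c in counts.items() if c >= min_count}
    return kept | {unk}

def encode(tag, tagset, unk="UNK"):
    # Map a gold tag onto the truncated tagset.
    return tag if tag in tagset else unk

# Toy training distribution with one long-tail tag:
train = ["NP"] * 50 + ["(S\\NP)/NP"] * 12 + ["((S\\NP)/PP)/NP"] * 3
tagset = build_tagset(train, min_count=10)
print(encode("NP", tagset))              # NP
print(encode("((S\\NP)/PP)/NP", tagset)) # UNK
```

This is exactly the truncation that constructive models avoid: every tag below the cutoff becomes indistinguishable UNK for a label-based tagger.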
“…Many linguistic phenomena follow power-law distributions and thus feature a long tail of individually rare events, which, as we will show, makes it nontrivial to measure calibration error with existing methods, including marginal calibration error (MCE), which requires sufficient samples of each class to produce a reliable estimate (Kumar et al., 2019). We evaluate two English sentence taggers with closed sets of hundreds of tags that disambiguate word tokens: a Combinatory Categorial Grammar (CCG) syntactic supertagger with 426 tags (Prange et al., 2021), and a Lexical Semantic Recognition (LSR) tagger with 598 tags (Liu et al., 2021).…”
Section: Introduction
confidence: 99%
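Why the long tail makes per-class calibration estimates fragile can be seen in a minimal sketch: marginal calibration compares each class's mean predicted confidence to its empirical accuracy, and a tail class with only a handful of samples yields a very noisy accuracy estimate. The data and function below are hypothetical toys, not the cited evaluation:

```python
from collections import defaultdict

def per_class_calibration(preds):
    """preds: list of (predicted_class, confidence, gold_class).
    Returns class -> (mean confidence, empirical accuracy, sample count)."""
    stats = defaultdict(lambda: [0.0, 0, 0])  # class -> [sum_conf, correct, n]
    for cls, conf, gold in preds:
        s = stats[cls]
        s[0] += conf
        s[1] += int(cls == gold)
        s[2] += 1
    return {cls: (s[0] / s[2], s[1] / s[2], s[2]) for cls, s in stats.items()}

# A head class with 100 samples and a tail class with only 2:
preds = [("NP", 0.9, "NP")] * 100 + [("rare", 0.8, "rare"), ("rare", 0.7, "NP")]
for cls, (mean_conf, acc, n) in per_class_calibration(preds).items():
    print(cls, round(mean_conf, 2), round(acc, 2), n)
```

For the tail class, the 0.5 empirical accuracy rests on two samples; one flipped prediction would move it to 0.0 or 1.0, so the per-class calibration gap is essentially unmeasurable there.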
“…Combinatory Categorial Grammar (CCG) (Steedman, 2000) is a mildly context-sensitive grammar formalism. Several neural CCG parsing methods have been proposed so far (Lewis and Steedman, 2014; Xu et al., 2015; Vaswani et al., 2016; Xu, 2016; Yoshikawa et al., 2017; Steedman, 2019, 2020; Bhargava and Penn, 2020; Tian et al., 2020; Prange et al., 2021; Liu et al., 2021). Currently, neural span-based models (Cross and Huang, 2016; Stern et al., 2017; Gaddy et al., 2018; Kitaev and Klein, 2018) have been successful in the field of constituency parsing.…”
Section: Introduction
confidence: 99%
“…Furthermore, as a by-product of our representation, the parsing models can assign out-of-vocabulary (OOV) categories, which have not appeared in the training data. This characteristic has been attracting attention in CCG parsing research (Bhargava and Penn, 2020; Prange et al., 2021; Liu et al., 2021). Our experimental results show that an off-the-shelf span-based parser with our representation is comparable with previous CCG parsers and can generate correct OOV categories.…”
Section: Introduction
confidence: 99%
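The OOV point can be illustrated with a toy sketch: a label-based tagger can only emit categories from its closed training tagset, whereas a decoder that assembles categories from atoms can produce a well-formed category string that never occurred in training. The training set and helper below are hypothetical, not any cited system's representation:

```python
# A closed training tagset, as a label-based tagger would see it:
train_tagset = {"NP", "S", "PP", "(S\\NP)", "((S\\NP)/NP)"}

def compose(result, slash, arg):
    """Assemble a functor category string from two subcategories."""
    return f"({result}{slash}{arg})"

# A constructive decoder building a ditransitive-like category step by
# step yields a well-formed category absent from the training tagset:
oov = compose(compose("(S\\NP)", "/", "PP"), "/", "NP")
print(oov, oov in train_tagset)
```

Membership in the training tagset is irrelevant to the decoder: any tree it can assemble from the grammar's atoms is a legal output, which is what allows correct OOV categories at test time.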