Towards Semi-Supervised Learning for Deep Semantic Role Labeling

Mehta, Sanket Vaibhav; Lee, Jay Yoon; Carbonell, Jaime G.

doi:10.18653/v1/d18-1538

Cited by 17 publications

(9 citation statements)

References 10 publications

“…For BIO tags, the set V comprises all taggings that don't include an I after an O, and the maximization problem can be solved in linear time using Viterbi decoding (Viterbi, 1967) as in Yao et al (2013); Mehta et al (2018). For IO tags, all taggings are valid, and maximization is done by predicting the tag with highest probability in each token independently.…”

Section: Decoding Spans From a Tagging (mentioning)

confidence: 99%

A Simple and Effective Model for Answering Multi-span Questions

Segal, Efrat, Shoham et al. 2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Models for reading comprehension (RC) commonly restrict their output space to the set of all single contiguous spans from the input, in order to alleviate the learning problem and avoid the need for a model that generates text explicitly. However, forcing an answer to be a single span can be restrictive, and some recent datasets also include multi-span questions, i.e., questions whose answer is a set of non-contiguous spans in the text. Naturally, models that return single spans cannot answer these questions. In this work, we propose a simple architecture for answering multi-span questions by casting the task as a sequence tagging problem, namely, predicting for each input token whether it should be part of the output or not. Our model substantially improves performance on span extraction questions from DROP and QUOREF by 9.9 and 5.5 EM points respectively.
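The constrained BIO decoding mentioned in the citation statement above can be illustrated with a small sketch. It assumes an untyped tag set (O, B, I), per-token log-probabilities as input, and a single hard constraint: I may not follow O (the sequence start is treated as if preceded by O). The function names `viterbi_bio` and `spans_from_tags` are hypothetical and are not taken from either paper's code.

```python
import math

TAGS = ("O", "B", "I")

def allowed(prev, cur):
    # Assumed constraint: an I tag may not directly follow an O tag.
    return not (prev == "O" and cur == "I")

def viterbi_bio(log_probs):
    """Return the highest-scoring valid tagging.

    log_probs: list of [logp(O), logp(B), logp(I)], one row per token.
    Runs in time linear in the number of tokens.
    """
    if not log_probs:
        return []
    n = len(TAGS)
    neg_inf = float("-inf")
    # Treat the start as if preceded by O, so I cannot open the sequence.
    score = [log_probs[0][j] if allowed("O", TAGS[j]) else neg_inf
             for j in range(n)]
    back = []
    for t in range(1, len(log_probs)):
        new_score, ptr = [], []
        for j in range(n):
            # Best previous tag among those allowed to precede tag j.
            best_i = max((i for i in range(n) if allowed(TAGS[i], TAGS[j])),
                         key=lambda i: score[i])
            new_score.append(score[best_i] + log_probs[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    j = max(range(n), key=lambda k: score[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [TAGS[k] for k in path]

def spans_from_tags(tags):
    """Convert a valid BIO tagging into (start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag in ("B", "O") and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B":
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return spans

probs = [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7], [0.8, 0.1, 0.1]]
log_probs = [[math.log(p) for p in row] for row in probs]
tags = viterbi_bio(log_probs)
print(tags, spans_from_tags(tags))
```

In this example, independent per-token argmax would give O, I, O, which violates the constraint; the constrained search instead returns B, I, O, which is the most probable valid tagging. Under IO tagging no such search is needed, since every tagging is valid.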

“…Accuracy in each of the three tasks was improved by respecting constraints. Additionally, for SRL, we employed GBI on a model trained with similar constraint enforcing loss as GBI's (Mehta*, Lee*, and Carbonell 2018), and observe that the additional test-time optimization of GBI still significantly improves the model output whereas A* does not. We believe this is because GBI searches in the proximity of the provided model weights; however, theoretical analysis of this hypothesis is left as a future work.…”

Section: Discussion (mentioning)

confidence: 99%

Gradient-Based Inference for Networks with Output Constraints

Lee, Mehta, Wick et al. 2019

AAAI

Self Cite

Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test-time, we nudge continuous model weights until the network's unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints, but improves accuracy, even when the underlying network is state-of-the-art.
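The test-time idea in the abstract above (nudge the weights until unconstrained decoding yields a valid output) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a tiny linear softmax tagger in place of a deep network, the BIO constraint "no I after O" as the hard constraint, and a simple update that lowers the log-probability of the currently decoded tagging, scaled by the number of violations. All names (`gbi_sketch`, `violations`, and so on) are hypothetical.

```python
import math

O, B, I = 0, 1, 2  # tag indices

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # W: 3 x d weight matrix (one row per tag), x: d-dimensional features.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def greedy_tags(W, xs):
    # Unconstrained inference: independent argmax per token.
    tags = []
    for x in xs:
        z = logits(W, x)
        tags.append(max(range(3), key=lambda k: z[k]))
    return tags

def violations(tags):
    # Count I tags that follow O (the start is treated as O).
    prev, count = O, 0
    for t in tags:
        if t == I and prev == O:
            count += 1
        prev = t
    return count

def gbi_sketch(W, xs, lr=1.0, max_steps=20):
    """Return (tags, steps_taken); the caller's W is left untouched."""
    W = [row[:] for row in W]  # per-input copy of the weights
    for step in range(max_steps + 1):
        tags = greedy_tags(W, xs)
        g = violations(tags)
        if g == 0 or step == max_steps:
            return tags, step
        # Gradient of g * sum_t log p(tags[t] | x_t) with respect to W.
        grad = [[0.0] * len(row) for row in W]
        for x, y in zip(xs, tags):
            p = softmax(logits(W, x))
            for k in range(3):
                dz = (1.0 if k == y else 0.0) - p[k]
                for j, v in enumerate(x):
                    grad[k][j] += g * dz * v
        # Step away from the violating output.
        for k in range(3):
            for j in range(len(W[k])):
                W[k][j] -= lr * grad[k][j]

W0 = [[2.0, 0.0], [0.0, 0.5], [0.0, 1.0]]
xs = [[1.0, 0.0], [0.0, 1.0]]
print(greedy_tags(W0, xs), gbi_sketch(W0, xs))
```

Here the initial weights decode to O, I, which violates the constraint; after one update the same greedy decoder returns O, B. Copying the weights inside the function reflects the per-input nature of the procedure described in the abstract; there is no guarantee this toy loop converges within `max_steps` on arbitrary inputs.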

“…This field contains a plethora of different neural symbolic methods and techniques. The methods that closely relate to our line of work seek to enforce constraints on the output of a neural network (Hu et al, 2016; Donadello et al, 2017; Diligenti et al, 2017; Mehta et al, 2018; Xu et al, 2018; Nandwani et al, 2019). For a more in-depth introduction, we refer the reader to these excellent recent surveys: Besold et al (2017) and De Raedt et al (2020).…”

Section: Related Work (mentioning)

confidence: 99%

Using Domain Knowledge to Guide Dialog Structure Induction via Neural Probabilistic Soft Logic

Pryor, Yuan, Liu et al. 2023

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Dialog Structure Induction (DSI) is the task of inferring the latent dialog structure (i.e., a set of dialog states and their temporal transitions) of a given goal-oriented dialog. It is a critical component for modern dialog system design and discourse analysis. Existing DSI approaches are often purely data-driven, deploy models that infer latent states without access to domain knowledge, underperform when the training corpus is limited/noisy, or have difficulty when test dialogs exhibit distributional shifts from the training domain. This work explores a neural-symbolic approach as a potential solution to these problems. We introduce Neural Probabilistic Soft Logic Dialogue Structure Induction (NEUPSL DSI), a principled approach that injects symbolic knowledge into the latent space of a generative neural model. We conduct a thorough empirical investigation on the effect of NEUPSL DSI learning on hidden representation quality, few-shot learning, and out-of-domain generalization performance. Over three dialog structure induction datasets and across unsupervised and semi-supervised settings for standard and cross-domain generalization, the injection of symbolic knowledge using NEUPSL DSI provides a consistent boost in performance over the canonical baselines.
