Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Miladinović, Đorđe; Shridhar, Kumar; Jain, Kushal; Paulus, Max B.; Buhmann, Joachim M.; Allen, Carl E.

doi:10.48550/arxiv.2209.12590

Cited by 2 publications

(2 citation statements)

References 0 publications


“…27) can be in an adversarial form. Miladinović et al [32] demonstrated the potential of learning a dropout model via a GAN-like formulation. We modify the objective in an analogous way by creating a max-min game between the Selector and the predictor:…”

Section: Objective Functions (mentioning)

confidence: 99%

A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies

Hè, Kabic

2023

Preprint


Nomenclature
Σ: Covariance matrix
G: Gram/kernel matrix
k(•): Kernel function
P(•): Probability density
P(•): Token mixing process
Re(•): Function that extracts the real component of a complex number
Element at ith position of column vector a
A *:j: Column vector in jth row of A
A i,j: Element in ith row, jth column of … matrix of the embedding dimension
F s (L×L): Vandermonde matrix of the sequence dimension
W: Weight matrix learned with element-wise non-linearity (e.g., ReLU, GELU)
W C (L×L): Weight matrix of a single convolution kernel
W K (D×N): Weight matrix of attention key (for self-attention, N = M)
W Q (D×M): Weight matrix of attention query
W V (D×M): Weight matrix of attention value
X: Resulting tokens with inductive bias introduced into X
X (L×D): Input sequence of length L and embedding dimension D, where L D


“…Often these single-hop questions are combined to form a multi-hop question that requires complex reasoning to solve it (Pan et al, 2021). Controllable text generation has been studied in the past for text generation (Hu et al, 2017; Miladinović et al, 2022; Carlsson et al, 2022), Wikipedia texts (Liu et al, 2018; Prabhumoye et al, 2018) and data-to-text generation (Puduppully and Lapata, 2021; Su et al, 2021). Controlled text generation is particularly useful for ensuring that the information is correct or the numbers are encapsulated properly (Gong et al, 2020).…”

Section: Related Work (mentioning)

confidence: 99%

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Shridhar, Jakub, El‐Assady, et al. 2022

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing


Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generation of didactically sound questions is challenging, requiring understanding of the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance, but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning. On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education. https://github.com/eth-nlped/scaffolding-generation
