Prajjwal Bhargava scite author profile

Prajjwal Bhargava

5 Publications

23 Citation Statements Received

97 Citation Statements Given

Publications

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Bhargava¹, Drozd², Rogers³

2021

Much of recent progress in NLU was shown to be due to models' learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) in a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report 2 successful and 3 unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Bhargava¹, Drozd², Rogers³

2021

Preprint

Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey

Bhargava¹, Ng²

2022

Preprint

Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey

Bhargava¹, Ng²

2022

AAAI

While commonsense knowledge acquisition and reasoning has traditionally been a core research topic in the knowledge representation and reasoning community, recent years have seen a surge of interest in the natural language processing community in developing pre-trained models and testing their ability to address a variety of newly designed commonsense knowledge reasoning and generation tasks. This paper presents a survey of these tasks, discusses the strengths and weaknesses of state-of-the-art pre-trained models for commonsense reasoning and generation as revealed by these tasks, and reflects on future research directions.

Adaptive Transformers for Learning Multimodal Representations

Bhargava¹

2020

The usage of transformers has grown from learning about language semantics to forming meaningful visiolinguistic representations. These architectures are often over-parametrized, requiring large amounts of computation. In this work, we extend adaptive approaches to learn more about model interpretability and computational efficiency. Specifically, we study attention spans, sparse, and structured dropout methods to help understand how their attention mechanism extends for vision and language tasks. We further show that these approaches can help us learn more about how the network perceives the complexity of input sequences, sparsity preferences for different modalities, and other related phenomena.
