Anshul Nasery scite author profile

Anshul Nasery

5 Publications

10 Citation Statements Received

93 Citation Statements Given


Affiliations

Indian Institute of Technology Bombay

Publications


MIMOQA: Multimodal Input Multimodal Output Question Answering

Singh, Nasery, Mehta et al. 2021


Multimodal research has picked up significantly in the space of question answering, with the task being extended to visual question answering, charts question answering, and multimodal input question answering. However, all these explorations produce a unimodal textual output as the answer. In this paper, we propose a novel task, MIMOQA (Multimodal Input Multimodal Output Question Answering), in which the output is also multimodal. Through human experiments, we empirically show that such multimodal outputs provide better cognitive understanding of the answers. We also propose a novel multimodal question-answering framework, MExBERT, that incorporates joint textual and visual attention to produce such a multimodal output. Our method relies on a novel multimodal dataset curated for this problem from publicly available unimodal datasets. We show the superior performance of MExBERT against strong baselines on both automatic and human metrics.


Rule Augmented Unsupervised Constituency Parsing

Sahay, Nasery, Maheshwari et al. 2021


Recently, unsupervised parsing of syntactic trees has gained considerable attention. A prototypical approach to such unsupervised parsing employs reinforcement learning and auto-encoders. However, no mechanism ensures that the learnt model leverages the well-understood language grammar. We propose an approach that utilizes very generic linguistic knowledge of the language, present in the form of syntactic rules, thus inducing better syntactic structures. We introduce a novel formulation that takes advantage of the syntactic grammar rules and is independent of the base system. We achieve new state-of-the-art results on two benchmark datasets, MNLI and WSJ.


Teaching CNNs to Mimic Human Visual Cognitive Process & Regularise Texture-Shape Bias

Mohla, Nasery, Banerjee 2022


Rule Augmented Unsupervised Constituency Parsing

Sahay, Nasery, Maheshwari et al. 2021

Preprint


Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks

Addepalli, Nasery, Babu et al. 2022

Preprint


Deep Neural Networks are known to be brittle to even minor distribution shifts compared to the training distribution. While one line of work has demonstrated that the Simplicity Bias (SB) of DNNs (a bias towards learning only the simplest features) is a key reason for this brittleness, another recent line of work has surprisingly found that diverse/complex features are indeed learned by the backbone, and their brittleness is due to the linear classification head relying primarily on the simplest features. To bridge the gap between these two lines of work, we first hypothesize and verify that while SB may not altogether preclude learning complex features, it amplifies simpler features over complex ones. Namely, simple features are replicated several times in the learned representations while complex features might not be replicated. This phenomenon, which we term the Feature Replication Hypothesis, coupled with the Implicit Bias of SGD to converge to maximum margin solutions in the feature space, leads the models to rely mostly on the simple features for classification. To mitigate this bias, we propose the Feature Reconstruction Regularizer (FRR) to ensure that the learned features can be reconstructed back from the logits. The use of FRR in linear layer training (FRR-L) encourages the use of more diverse features for classification. We further propose to finetune the full network by freezing the weights of the linear layer trained using FRR-L, to refine the learned features, making them more suitable for classification. Using this simple solution, we demonstrate up to 15% gains in OOD accuracy on the recently introduced semi-synthetic datasets with extreme distribution shifts. Moreover, we demonstrate noteworthy gains over existing SOTA methods on the standard OOD benchmark DomainBed as well.
