Natural Language Processing research has recently been dominated by large-scale transformer models. Although they achieve state-of-the-art results on many important language tasks, transformers often require expensive compute resources and days to weeks of training time. This is feasible for researchers at big tech companies and leading research universities, but not for scrappy start-up founders, students, and independent researchers. Stephen Merity's SHA-RNN, a compact hybrid attention-RNN model, is designed for consumer-grade hardware, requiring significantly fewer parameters and less training time to reach near state-of-the-art results. We analyze Merity's model through an exploratory study of several components of the architecture, assessing both training time and overall quality. Ultimately, we combine these findings into a new architecture, which we call SHAQ: Single Headed Attention Quasi-recurrent Neural Network. With our new architecture, we achieve accuracy similar to that of the SHA-RNN while attaining a 4x speed-up in training.