2021
DOI: 10.48550/arxiv.2111.06832
Preprint

Speeding Up Entmax

Abstract: Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits. However, because it produces a dense probability distribution, each token in the vocabulary has a nonzero chance of being selected at each generation step, leading to a variety of reported problems in text generation. The α-entmax of Peters et al. (2019) solves this problem, but is considerably slower than softmax. In this paper, we propose an alternative to α-entmax, which keeps its virtuous characterist…
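For context beyond this record: the sparsity the abstract refers to can be illustrated with a small, self-contained sketch. This is not the paper's proposed method; it implements sparsemax, the α=2 special case of α-entmax (Martins & Astudillo, 2016), via the standard sort-and-threshold algorithm, assuming PyTorch is available.

```python
import torch

def sparsemax(z: torch.Tensor) -> torch.Tensor:
    """Map 1-D logits z to a sparse probability vector (alpha=2 entmax)."""
    z_sorted, _ = torch.sort(z, descending=True)
    k = torch.arange(1, z.numel() + 1, dtype=z.dtype)
    cumsum = torch.cumsum(z_sorted, dim=0)
    # Support size: largest k with 1 + k * z_sorted[k-1] > sum of top-k logits.
    support = 1 + k * z_sorted > cumsum
    k_max = int(support.nonzero().max()) + 1
    tau = (cumsum[k_max - 1] - 1) / k_max  # threshold subtracted from logits
    return torch.clamp(z - tau, min=0.0)

logits = torch.tensor([2.0, 1.5, 0.1, -1.0])
print(torch.softmax(logits, dim=0))  # dense: every entry is nonzero
print(sparsemax(logits))             # sparse: tensor([0.75, 0.25, 0.00, 0.00])
```

Run on the same logits, softmax assigns every token a nonzero probability, while sparsemax returns exact zeros for low-scoring tokens; general α-entmax interpolates between the two as α moves from 1 to 2, which is the behavior the paper aims to retain at lower computational cost.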

Cited by 0 publications
References 6 publications
