2019
DOI: 10.48550/arxiv.1901.10668
Preprint

Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference

Shun Liao, Ting Chen, Tian Lin, et al.

Abstract: Computing the softmax function is expensive when the number of output classes is large. In this paper, we present a novel softmax inference speedup method, Doubly Sparse Softmax (DS-Softmax), that leverages a sparse mixture of sparse experts to efficiently retrieve the top-k classes. Unlike most existing methods, which approximate a fixed softmax, our method is learning-based and can adapt the softmax weights for a better inference speedup. In particular, our method learns a two-l…
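The abstract outlines two levels of sparsity: a sparse gate first selects one responsible expert, and that expert's softmax then covers only a small subset of the output classes. Below is a minimal sketch of such two-level top-k inference. It is an illustration under stated assumptions (a hard single-expert gate and precomputed per-expert class subsets), not the authors' released implementation, and all function and variable names are hypothetical.

```python
# Minimal sketch of two-level sparse softmax inference in the spirit of
# DS-Softmax (arXiv:1901.10668). Names, shapes, and the hard gate are
# illustrative assumptions, not the paper's reference code.
import numpy as np

def ds_softmax_topk(h, gate_W, expert_Ws, expert_class_ids, k=5):
    """Return (class_ids, probs) for the approximate top-k classes.

    h                -- context vector, shape (d,)
    gate_W           -- gating weights, shape (n_experts, d); the learned
                        gate is assumed sparse, so a single expert fires
    expert_Ws        -- per-expert weight matrices, each shape
                        (n_classes_e, d), covering only that expert's
                        class subset (the "sparse experts")
    expert_class_ids -- arrays mapping expert-local rows to global class
                        ids (subsets may overlap across experts)
    """
    # Level 1: the sparse gate picks the single most responsible expert.
    e = int(np.argmax(gate_W @ h))

    # Level 2: exact softmax, but only over that expert's small subset.
    logits = expert_Ws[e] @ h
    logits = logits - logits.max()          # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()

    top = np.argsort(-probs)[:k]            # top-k within the subset
    return expert_class_ids[e][top], probs[top]
```

Because each expert scores only its own class subset, the inner matrix-vector product shrinks from O(|V|·d) to O(|V_e|·d), which is the source of the inference speedup the abstract claims.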

Cited by 1 publication (1 citation statement)
References: 16 publications
“…Thus, its embeddings are better than other models for sequences belonging to that subset but it also is not ignorant of the rest of the space. Such hybrid approaches have been used previously in machine learning ( Peralta et al 2019 , Liao et al 2019 ).…”
Section: Methods
Citation type: mentioning
Confidence: 99%