2020
DOI: 10.1109/tip.2020.3005508
Biased Mixtures of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations

Abstract: We propose a novel mixture-of-experts class to optimize computer vision models in accordance with data transfer limitations at test time. Our approach postulates that the minimum acceptable amount of data allowing for highly accurate results can vary for different input space partitions. Therefore, we consider mixtures where experts require different amounts of data, and train a sparse gating function to divide the input space for each expert. By appropriate hyperparameter selection, our approach is able to bi…
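The abstract describes experts that consume different amounts of data plus a sparse gating function that partitions the input space among them. As a rough, hypothetical sketch of that idea (not the paper's implementation), the PyTorch snippet below routes each input to exactly one expert, where cheaper experts receive lower-resolution versions of the input; every name, resolution, and architecture choice here is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiasedMoESketch(nn.Module):
    """Hypothetical sketch: one expert per input resolution; a sparse gate
    picks a single expert, so cheaper experts transfer less input data."""
    def __init__(self, num_classes=10, resolutions=(8, 16, 32)):
        super().__init__()
        self.num_classes = num_classes
        self.resolutions = resolutions
        # tiny linear "experts"; real experts would be networks of varying cost
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Flatten(), nn.Linear(3 * r * r, num_classes))
            for r in resolutions
        )
        # the gate scores experts from a cheap 8x8 view of the input
        self.gate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, len(resolutions)))

    def forward(self, x):                              # x: (B, 3, 32, 32)
        cheap = F.adaptive_avg_pool2d(x, 8)
        expert_idx = self.gate(cheap).argmax(dim=-1)   # sparse: one expert per input
        out = x.new_zeros(x.size(0), self.num_classes)
        for i, (r, expert) in enumerate(zip(self.resolutions, self.experts)):
            sel = expert_idx == i
            if sel.any():
                # expert i only ever sees a resolution-r version of its inputs,
                # so routing to low-r experts transfers less data
                out[sel] = expert(F.adaptive_avg_pool2d(x[sel], r))
        return out, expert_idx
```

A hard argmax gate is non-differentiable, so training it requires a relaxation or auxiliary objective; the paper's actual training procedure and its biasing hyperparameters are not reproduced here.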

Cited by 5 publications (4 citation statements) · References 44 publications
“…Finally, we add a small amount of noise with standard deviation $\frac{1}{E}$ to the activations $Wx$, which we find improves performance. We empirically found this performed well but that the setup was robust to this parameter.…”
Section: Routing (mentioning)
Confidence: 95%
“…For each MoE layer in V-MoE, we use the routing function $g(x) = \mathrm{TOP}_k\left(\mathrm{softmax}\left(Wx + \epsilon\right)\right)$, where $\mathrm{TOP}_k$ is an operation that sets all elements of the vector to zero except the elements with the largest $k$ values, and $\epsilon$ is sampled independently $\epsilon \sim \mathcal{N}\left(0, \frac{1}{E^2}\right)$ entry-wise. In practice, we use $k = 1$ or $k = 2$.…”
Section: Routing (mentioning)
Confidence: 99%
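Written out, the routing rule quoted above is straightforward to implement: add entry-wise Gaussian noise of standard deviation $\frac{1}{E}$ to the activations $Wx$, take a softmax, and zero all but the $k$ largest entries. The sketch below (in the same illustrative PyTorch as above) does exactly that; tensor shapes and variable names are assumptions, not code from the citing V-MoE work.

```python
import torch

def noisy_topk_routing(x, W, k=2):
    """Sketch of g(x) = TOP_k(softmax(Wx + eps)), eps ~ N(0, 1/E^2) entry-wise."""
    E = W.size(0)                                     # E = number of experts
    logits = x @ W.t()                                # (batch, E) = Wx
    logits = logits + torch.randn_like(logits) / E    # noise with std 1/E
    probs = torch.softmax(logits, dim=-1)
    # TOP_k: keep only the k largest entries per row, zero out the rest
    topk_vals, topk_idx = probs.topk(k, dim=-1)
    gates = torch.zeros_like(probs).scatter_(-1, topk_idx, topk_vals)
    return gates                                      # sparse routing weights

# usage: route a batch of 4 tokens of dim 16 over E = 8 experts with k = 2
x = torch.randn(4, 16)
W = torch.randn(8, 16)
g = noisy_topk_routing(x, W, k=2)   # each row has exactly 2 non-zero entries
```

The noise only perturbs routing decisions near ties, which matches the quoted observation that performance is robust to its exact scale.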