2019
DOI: 10.3390/e21100924

Learnability for the Information Bottleneck

Abstract: The Information Bottleneck (IB) method (Tishby et al. (2000)) provides an insightful and principled approach for balancing compression and prediction for representation learning. The IB objective I(X; Z) − βI(Y ; Z) employs a Lagrange multiplier β to tune this trade-off. However, in practice, not only is β chosen empirically without theoretical guidance, there is also a lack of theoretical understanding between β, learnability, the intrinsic nature of the dataset and model capacity. In this paper, we show that…
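
To make the objective in the abstract concrete, the following is a minimal sketch (not the authors' code) that evaluates I(X; Z) − βI(Y; Z) for small discrete distributions. The function names, the toy joint distribution `p_xy`, and the two encoders are assumptions chosen for illustration; the only modeling assumption is the usual IB Markov chain Z – X – Y, so that p(y, z) = Σ_x p(x, y) p(z|x).

```python
import math

def mutual_information(joint):
    """Mutual information I(A;B) in nats for a table joint[a][b] = p(a, b)."""
    pa = [sum(row) for row in joint]           # marginal p(a)
    pb = [sum(col) for col in zip(*joint)]     # marginal p(b)
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:                          # 0 * log(0) is taken as 0
                mi += p * math.log(p / (pa[a] * pb[b]))
    return mi

def ib_lagrangian(p_xy, enc, beta):
    """I(X;Z) - beta * I(Y;Z) for an encoder enc[x][z] = p(z|x).

    Assumes the Markov chain Z - X - Y (hypothetical helper for illustration).
    """
    nx, ny, nz = len(p_xy), len(p_xy[0]), len(enc[0])
    px = [sum(row) for row in p_xy]
    # p(x, z) = p(x) p(z|x)
    p_xz = [[px[x] * enc[x][z] for z in range(nz)] for x in range(nx)]
    # p(y, z) = sum_x p(x, y) p(z|x)
    p_yz = [[sum(p_xy[x][y] * enc[x][z] for x in range(nx))
             for z in range(nz)] for y in range(ny)]
    return mutual_information(p_xz) - beta * mutual_information(p_yz)

# Toy joint distribution over binary X and Y (illustrative values only).
p_xy = [[0.4, 0.1],
        [0.1, 0.4]]
beta = 2.0

trivial = [[0.5, 0.5], [0.5, 0.5]]   # Z independent of X
identity = [[1, 0], [0, 1]]          # Z is a copy of X

trivial_value = ib_lagrangian(p_xy, trivial, beta)
identity_value = ib_lagrangian(p_xy, identity, beta)
print(trivial_value, identity_value)
```

With the trivial encoder both mutual-information terms vanish, so the objective is zero; with the identity encoder it equals H(X) − βI(X; Y). Comparing the two values for different β shows, in this toy setting, how β controls whether a non-trivial representation scores better than the trivial one.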

Citations: cited by 29 publications (46 citation statements)
References: 19 publications
“…The range of the Lagrange multipliers that allow the exploration of the IB curve is contained by which is also contained by , where where is the derivative of w.r.t. evaluated at r, is the set of possible realizations of X and and are defined as in [27] (Note in [27] they consider the dual problem (see Appendix G), so when they refer to it translates to β in this article). That is, .…”
Section: The Convex IB Lagrangian (mentioning)
confidence: 99%
“…Corollaries 2 and 3 allow us to reduce the range search for when we want to explore the IB curve. Practically, might be difficult to calculate so Wu et al. [27] derived an algorithm to approximate it. However, we still recommend setting the numerator to 1 for simplicity.…”
Section: The Convex IB Lagrangian (mentioning)
confidence: 99%
“…This demonstrates the existence of a critical β for each predictive coding scheme, above which m needs to be increased to extract more predictive information and below which additional values of the representation variable encode redundant portions of allele frequency space. While we do not estimate the critical β, approaches to estimating them are presented in [42,43].…”
Section: Evolutionary Dynamics (mentioning)
confidence: 99%