2018 12th International Conference on Signal Processing and Communication Systems (ICSPCS)
DOI: 10.1109/icspcs.2018.8631758
An Information Theoretic View on Learning of Artificial Neural Networks

Cited by 6 publications (7 citation statements) · References 12 publications
“…The convolutional neural network could also be investigated regarding whether it satisfies DPI or otherwise. In the case of a feed-forward neural network, the Markovian structure and data processing inequalities across layers are generally accepted [10, 12]. In a previous study [9], it was stated that this could also be seen in CNN networks, despite the calculation limits.…”
Section: Experiments and Results
Confidence: 99%
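The data processing inequality invoked in this statement can be verified exactly for a small discrete Markov chain. The sketch below (plain NumPy, illustrative only; the chain X → T1 → T2 stands in for successive network layers) computes mutual information from explicit joint distributions and checks that I(X; T2) ≤ I(X; T1):

```python
import numpy as np

def mutual_information(pxy):
    """Exact mutual information (in nats) from a 2-D joint distribution."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

rng = np.random.default_rng(0)

# Markov chain X -> T1 -> T2 built from random stochastic channels.
px = np.full(4, 0.25)                  # uniform input distribution over X
A = rng.dirichlet(np.ones(4), size=4)  # rows: P(T1 | X), each summing to 1
B = rng.dirichlet(np.ones(4), size=4)  # rows: P(T2 | T1)

p_x_t1 = px[:, None] * A               # joint P(X, T1)
p_x_t2 = p_x_t1 @ B                    # joint P(X, T2), using the Markov property

i_x_t1 = mutual_information(p_x_t1)
i_x_t2 = mutual_information(p_x_t2)

# Data processing inequality: information about X can only shrink
# as it passes through the second channel.
assert i_x_t2 <= i_x_t1 + 1e-12
```

With deterministic layers, as in a trained feed-forward network, the same chain structure applies, which is why the inequality across layers is generally accepted.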
“…In addition to these, because the selection of hyperparameters, such as the learning speed and batch size, is also intuitive and has no transparent algorithmic structure, these networks are not reproducible. Thus, efforts [8, 12] are being made to make DNNs more understandable and transparent.…”
Section: Related Work
Confidence: 99%
“…Shwartz-Ziv and Tishby's [10] information bottleneck theory, proposed as a theoretical basis for explaining deep neural network learning, has triggered much controversy. These discussions have focused on attributing the causes of the observed compression phase to stochasticity in training [10, 38, 39], the specific activation function used [12], the initialisation of models [40], and the method used for estimating mutual information [41]. Fundamental to much of the discussion in recent literature on this subject is the difficulty of estimating mutual information in high-dimensional datasets, which requires accurate models of the probability distributions.…”
Section: Discussion
Confidence: 99%
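The estimator-dependence raised in this statement is easy to demonstrate: a plug-in (binning) estimate of mutual information changes substantially with the bin count alone. The following sketch (NumPy only; the Gaussian toy data and bin counts are illustrative choices, not from the paper) computes the same quantity at three resolutions:

```python
import numpy as np

def binned_mi(x, y, bins):
    """Plug-in mutual information estimate (nats) via 2-D histogram binning."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
y = x + 0.5 * rng.standard_normal(2000)   # correlated pair with known structure

# Coarse bins discard information; very fine bins inflate the estimate
# through finite-sample bias -- the estimator, not the data alone,
# determines the reported mutual information.
estimates = {b: binned_mi(x, y, b) for b in (4, 16, 64)}
```

This is the practical core of the controversy: the compression phase reported for a network depends on exactly such discretisation choices, so different estimators can yield different information-plane trajectories for the same model.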