Frivolous Units: Wider Networks Are Not Really That Wide

Casper, Stephen T.; Boix, Xavier; D’Amario, Vanessa; Schrimpf, Martin; Vinken, Kasper; Kreiman, Gabriel

doi:10.48550/arxiv.1912.04783

Cited by 2 publications

(3 citation statements)

References 0 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…It is worth mentioning that wide natural models exhibit a larger number of redundant feature maps compared to their thinner counterparts. This is consistent with the results in [48], where the existence of so-called redundant units is proved and leveraged to explain implicit regularization in wide (natural) models. Our findings suggest a novel direction for investigation about the mechanism through which local robustness may be implemented by adversarially-trained CNNs, namely a coupling between feature maps.…”

Section: Feature Maps Redundancysupporting

confidence: 90%

On the Properties of Adversarially-Trained CNNs

Carletti¹,

Terzi²,

Susto³

2022

Preprint

View full text Add to dashboard Cite

Adversarial Training has proved to be an effective training paradigm to enforce robustness against adversarial examples in modern neural network architectures. Despite many efforts, explanations of the foundational principles underpinning the effectiveness of Adversarial Training are limited and far from being widely accepted by the Deep Learning community. In this paper, we describe surprising properties of adversarially-trained models, shedding light on mechanisms through which robustness against adversarial attacks is implemented. Moreover, we highlight limitations and failure modes affecting these models that were not discussed by prior works. We conduct extensive analyses on a wide range of architectures and datasets, performing a deep comparison between robust and natural models.

show abstract

Section: Feature Maps Redundancysupporting

confidence: 90%

On the Properties of Adversarially-Trained CNNs

Carletti¹,

Terzi²,

Susto³

2022

Preprint

View full text Add to dashboard Cite

show abstract

“…And as discussed in Section IV-C, networks can also be interpreted by partitioning them into modules and studying each separately. Furthermore, "frivolous" neurons [41] are compressible and include sets of redundant neurons which can be interpreted as modules and can often be merged by weight refactorization. And finally, compression can guide interpretations [152], and interpretations can guide compression [266] (see Section III-G on frivolous neurons and compression).…”

Section: Discussionmentioning

confidence: 99%

“…Frivolous neurons are not important to a network. [41] define and detect two distinct types in DNNs: prunable neurons which can be removed from a network by ablation, and redundant neurons which can be removed by refactoring layers. They pose a challenge for interpretability because a frivolous neuron's contribution to the network may either be meaningless or difficult to detect with certain methods (e.g.…”

Section: G Frivolous Neurons (Hazard)mentioning

confidence: 99%

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks

Tilman¹,

Ho²,

Casper³

et al. 2022

Preprint

Self Cite

View full text Add to dashboard Cite

The last decade of machine learning has seen drastic increases in scale and capabilities, and deep neural networks (DNNs) are increasingly being deployed across a wide range of domains. However, the inner workings of DNNs are generally difficult to understand, raising concerns about the safety of using these systems without a rigorous understanding of how they function. In this survey, we review literature on techniques for interpreting the inner components of DNNs, which we call inner interpretability methods. Specifically, we review methods for interpreting weights, neurons, subnetworks, and latent representations with a focus on how these techniques relate to the goal of designing safer, more trustworthy AI systems. We also highlight connections between interpretability and work in modularity, adversarial robustness, continual learning, network compression, and studying the human visual system. Finally, we discuss key challenges and argue for future work in interpretability for AI safety that focuses on diagnostics, benchmarking, and robustness.

show abstract

Frivolous Units: Wider Networks Are Not Really That Wide

Cited by 2 publications

References 0 publications

On the Properties of Adversarially-Trained CNNs

On the Properties of Adversarially-Trained CNNs

Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks

Contact Info

Product

Resources

About