2021
DOI: 10.1073/pnas.2014196118

Unsupervised neural network models of the ventral visual stream

Abstract: Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network mod…


Cited by 252 publications (236 citation statements)
References 72 publications
“…While the generalization capabilities in human recognition are still out of reach for networks with a feedforward architecture and supervised learning regime (Geirhos et al., 2019; Geirhos, Janssen, et al., 2018; Geirhos, Temme, et al., 2018), developing models that more closely match the human brain in terms of architecture (Evans et al., 2021; Kietzmann, Spoerer, et al., 2019) and learning rules (Zhuang et al., 2021) offers new perspectives for meeting this goal. Such models have yielded important insights into how the brain solves the general problem of object recognition and show improved generalization in some cases (Geirhos, Narayanappa, et al., 2020; Spoerer et al., 2017) but are still outmatched by humans (Geirhos, Meding, et al., 2020; Geirhos, Narayanappa, et al., 2020).…”
Section: Discussion
confidence: 99%
“…A fourth future direction is to develop new unsupervised learning algorithms that implement some of the core ideas of temporal contiguity learning, but are scaled to produce high-performing visual systems that are competitive with state-of-the-art neural network systems trained by full supervision. Many computational efforts have touched on this direction (Agrawal et al., 2015; Bahroun and Soltoggio, 2017; Goroshin et al., 2014; Higgins et al., 2016; Kheradpisheh et al., 2016; Lotter et al., 2016; Srivastava et al., 2015; Wang and Gupta, 2015; Whitney et al., 2016), and some are just beginning to make predictions about the responses along the ventral stream (Zhuang et al., 2021). A key next step will be to put those full-scale models to experimental test at both the neurophysiological and behavioral levels.…”
Section: Discussion
confidence: 99%
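The "temporal contiguity learning" referenced above can be illustrated with a minimal trace-rule sketch: a Hebbian update whose postsynaptic term is an exponentially decaying trace of past activity, so units come to encode features that persist across adjacent frames. This is an illustrative toy under assumed parameters (the function name, learning rate, and decay constant are hypothetical), not the implementation used in any of the cited models.

```python
import numpy as np

def trace_learning(frames, n_units=4, eta=0.01, decay=0.8, seed=0):
    """One-layer Hebbian learner whose postsynaptic term is a running
    (exponentially decaying) trace of activity, so weights come to
    encode features that are stable over time (temporal contiguity)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n_units, frames.shape[1]))
    trace = np.zeros(n_units)
    for x in frames:                                # time-ordered frames
        y = W @ x                                   # feedforward response
        trace = decay * trace + (1 - decay) * y     # temporal trace of activity
        W += eta * np.outer(trace, x)               # Hebbian update with the trace
        W /= np.linalg.norm(W, axis=1, keepdims=True)  # keep weights bounded
    return W

# Toy example: two "objects", each shown as a short run of noisy views,
# so temporally adjacent frames share object identity.
rng = np.random.default_rng(1)
proto = rng.normal(size=(2, 16))
frames = np.vstack([p + 0.05 * rng.normal(size=(8, 16)) for p in proto])
W = trace_learning(frames)
print(W.shape)  # (4, 16)
```

Because the trace ties updates to what was active on recent frames, the learned weights favor slowly varying (identity-like) structure over frame-by-frame image detail, which is the core idea the cited efforts scale up.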
“…VisNet also shows how its type of learning can be performed without prejudging what is to be learned, and without providing a biologically implausible teacher for what the outputs of each neuron should be, which in contrast is assumed in HMAX and deep learning. Indeed, in deep learning with convolution networks the focus is still to categorize based on image properties (Rajalingham et al., 2018; Zhuang et al., 2021), rather than object properties that are revealed for example when objects transform in the world (Rolls, 2021a).…”
Section: Comparison of HMAX with VisNet
confidence: 99%
“…Although progress has been made in unsupervised versions of deep convolutional neural networks trained with backpropagation of error (Zhuang et al., 2021), the network still relies on image features to discriminate objects, and therefore will have problems with learning view-invariant object representations to solve problems such as that illustrated in Figure 7, in which different views of an object have different image properties (Robinson and Rolls, 2015). VisNet solves this and other aspects of invariant object recognition by using the statistics of the world captured by slow learning (Robinson and Rolls, 2015).…”
Section: Transform-Invariant Representations of Objects and Faces
confidence: 99%