2016
DOI: 10.1038/srep27755

Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence

Abstract: The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing …

citations
Cited by 682 publications
(715 citation statements)
references
References 58 publications
“…A CNN is not designed to be biologically plausible, although we may draw loose analogies between a CNN and the human cortex about their processing stages and representations (VanRullen, 2017). Comparative fMRI studies show that representations in the CNN correlate with emerging visual representations in the human brain (Cichy, Khosla, Pantazis, & Oliva, 2017; Cichy, Khosla, Pantazis, Torralba, & Oliva, 2016). Semantic similarity corresponds to more processed visual representations found in final CNN layers (in our case, fc7) or the inferotemporal cortex in the brain, whereas perceptual similarity is based on low-level representations in lower layers of a CNN or in the visual cortex.…”
Section: Discussion
mentioning
confidence: 99%
“…These models, which in essence are a generalized form of principles first proposed by Hubel and Wiesel, have garnered increasing attention because they approach human levels of performance on tasks such as object recognition for real-world images [129,130] and show a degree of correspondence between individual layers of the networks and different levels of neural visual object processing [131–133]. CNNs that perform better on object-recognition tasks are found to be better predictors of neuronal spiking data in monkey inferotemporal cortex [134], suggesting the importance of a goal-driven approach for creating a model of a given sensory system.…”
Section: Figure
mentioning
confidence: 99%
“…MEG/EEG studies have furthermore shown that early layers of DNNs have a peak explained variance that is earlier than higher-tier DNN layers (Cichy, Khosla, Pantazis, Torralba, & Oliva, 2016; Ramakrishnan, Scholte, Smeulders, & Ghebreab, 2016). In addition, the DNN model has been shown to predict neural responses in IT, both from humans and macaques, much better than any other computational model (Khaligh-Razavi & Kriegeskorte, 2014; Yamins et al., 2014).…”
mentioning
confidence: 99%