2019
DOI: 10.1101/555193
Preprint

Orthogonal Representations of Object Shape and Category in Deep Convolutional Neural Networks and Human Visual Cortex

Abstract: Deep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we cor…

Cited by 20 publications (34 citation statements)
References 39 publications
“…Most network architectures consist of numerous convolutional layers, which tend to abstract local feature information when trained on large stimulus sets, followed by a few fully connected (FC) layers that more closely mirror the category structure in the training set. Several studies have reported correlations between FC layers and category-selective areas in human OTC, suggesting that the animacy division is represented in these networks when trained on natural images (Bracci et al., 2019; Jozwik et al., 2017; Khaligh-Razavi and Kriegeskorte, 2014; Zeman et al., 2020). However, these studies did not include a taxonomic hierarchy in their stimulus designs.…”
Section: Face-body and Taxonomy But Not Animacy Models Explain Acti…
confidence: 99%
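The layer-to-brain correlations reported in studies like these are typically computed with representational similarity analysis (RSA). Below is a minimal, hypothetical sketch of that logic in Python; the `layer_acts` and `roi_patterns` arrays are random placeholders standing in for FC-layer activations and OTC voxel patterns, not data from the studies cited above.

```python
# Minimal RSA sketch: correlate a CNN layer's representational geometry
# with an fMRI region's. All data here are random placeholders.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
layer_acts = rng.standard_normal((50, 4096))   # hypothetical FC-layer activations, 50 images
roi_patterns = rng.standard_normal((50, 200))  # hypothetical OTC voxel patterns, same images

# Representational dissimilarity matrices (RDMs): 1 - Pearson r for every
# image pair, in condensed (upper-triangle) form.
rdm_layer = pdist(layer_acts, metric="correlation")
rdm_roi = pdist(roi_patterns, metric="correlation")

# A rank correlation between the two RDMs is the usual layer-to-ROI similarity score.
rho, p = spearmanr(rdm_layer, rdm_roi)
print(f"layer-to-ROI RDM correlation: rho = {rho:.3f} (p = {p:.3g})")
```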
“…Third, we investigated whether face and body morphology (in particular, the relative similarity of animal faces and bodies to those of humans) might better explain the relationship between activity patterns than taxonomy. Finally, we assessed whether both factors might be reflected in the patterns of activation weights of layers of multiple deep neural networks (DNNs), which some studies suggest also distinguish the animacy of objects (Bracci et al., 2019; Jozwik et al., 2017; Khaligh-Razavi and Kriegeskorte, 2014; Zeman et al., 2020).…”
Section: Introduction
confidence: 99%
“…Despite being developed as computer vision tools, DNNs trained to recognise objects in images are also unsurpassed at predicting how natural images are represented in high-level ventral visual areas of the human and non-human primate brain (Agrawal et al., 2014; Bashivan et al., 2019; Cadieu et al., 2014; Cichy et al., 2016; Devereux et al., 2018; Eickenberg et al., 2017; Güçlü & van Gerven, 2015; Horikawa & Kamitani, 2017; Kubilius et al., 2018; Lindsay, 2020; Ponce et al., 2019; Schrimpf et al., 2018; Xu & Vaziri-Pashkam, 2020; Yamins & DiCarlo, 2016). There is some variability in the accuracy with which different recent DNNs can predict high-level visual representations (Xu & Vaziri-Pashkam, 2020; Zeman et al., 2020), despite broadly high performance. It remains unclear how strongly network design choices, such as depth, architecture, task training, and subsequent model fitting to neural data, may contribute to the observed variations.…”
Section: Introduction
confidence: 99%
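The prediction of visual representations mentioned here is commonly implemented as a voxelwise encoding model: regularised regression from DNN features to voxel responses, scored by cross-validated prediction accuracy. A minimal sketch under that assumption, with synthetic placeholder data (the cited studies differ in features, regularisation, and validation details):

```python
# Minimal voxelwise encoding-model sketch: ridge regression from DNN
# features to voxel responses, scored by cross-validated correlation.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 4096))  # hypothetical DNN features for 120 images
# Synthetic voxel responses: a linear readout of some features plus noise.
Y = X[:, :200] @ rng.standard_normal((200, 50)) + rng.standard_normal((120, 50))

scores = np.zeros(Y.shape[1])
for train, test in KFold(n_splits=5).split(X):
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X[train], Y[train])
    pred = model.predict(X[test])
    # Accumulate per-voxel correlation between predicted and measured responses.
    for v in range(Y.shape[1]):
        scores[v] += np.corrcoef(pred[:, v], Y[test, v])[0, 1] / 5

print(f"mean cross-validated voxel correlation: {scores.mean():.3f}")
```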
“…Agrawal et al., 2014; Bashivan et al., 2019; Cadieu et al., 2014; Güçlü & van Gerven, 2015; Horikawa & Kamitani, 2017; Ponce et al., 2019), while others treat the representations within a layer of a network as fixed (e.g. Truzzi & Cusack, 2020; Zeman et al., 2020).…”
Section: Introduction
confidence: 99%
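The "fixed" treatment described here amounts to extracting frozen activations from a pretrained network once, then comparing them to neural data without any fitting step. A minimal PyTorch sketch of that extraction, using torchvision's AlexNet and its first FC layer purely as illustrative choices, not the cited papers' exact setup:

```python
# Minimal "fixed representations" sketch: pull frozen activations from one
# layer of a pretrained network with a forward hook.
import torch
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()

feats = {}
def hook(module, inputs, output):
    feats["fc6"] = output.detach()

# classifier[1] is the first fully connected layer ("fc6") in AlexNet.
model.classifier[1].register_forward_hook(hook)

images = torch.randn(8, 3, 224, 224)  # stand-in for a preprocessed image batch
with torch.no_grad():
    model(images)

print(feats["fc6"].shape)  # (8, 4096): one fixed feature vector per image
```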