Convolutional neural networks explain tuning properties of anterior, but not middle, face-processing areas in macaque inferotemporal cortex

Raman, Rajani; Hosoya, Haruo

doi:10.1038/s42003-020-0945-x

Cited by 19 publications

(15 citation statements)

References 61 publications

(140 reference statements)

“…As expected, the neurons of this model have no defined spatial organization and thus result in a random selectivity map. We note the existence of class-selective neurons is not guaranteed, but their appearance here is in-line with observations from prior work [38,50,2]. Secondly, we compare our TVAE model (middle) with our re-implementation of the TDANN [38] (right).…”

Section: Methods (supporting)

confidence: 80%

Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

Keller, Gao, Welling

2021

Preprint

Category-selectivity in the brain describes the observation that certain spatially localized areas of the cerebral cortex tend to respond robustly and selectively to stimuli from specific limited categories. One of the most well known examples of category-selectivity is the Fusiform Face Area (FFA), an area of the inferior temporal cortex in primates which responds preferentially to images of faces when compared with objects or other generic stimuli. In this work, we leverage the newly introduced Topographic Variational Autoencoder to model the emergence of such localized category-selectivity in an unsupervised manner. Experimentally, we demonstrate our model yields spatially dense neural clusters selective to faces, bodies, and places through visualized maps of Cohen's d metric. We compare our model with related supervised approaches, namely the TDANN, and discuss both theoretical and empirical similarities. Finally, we show preliminary results suggesting that our model yields a nested spatial hierarchy of increasingly abstract categories, analogous to observations from the human ventral temporal cortex.
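The abstract above reports selectivity maps in terms of Cohen's d. As a rough illustration only, the sketch below computes Cohen's d for a single unit using the pooled-standard-deviation form. That variant is an assumption here, since the abstract does not say which form the authors used, and the function name and response values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (one common variant)."""
    n_a, n_b = len(group_a), len(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = variance(group_a), variance(group_b)
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical activations of one model unit to face and non-face images.
face_responses = [2.0, 4.0, 6.0]
other_responses = [1.0, 3.0, 5.0]
d = cohens_d(face_responses, other_responses)
```

Applying such a function to every unit and plotting the values at each unit's position would give a map of the kind the abstract describes; a larger positive d means a stronger preference for the first category.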

“…A recent paper found the 2D Morphable Model, a classic model of face representation from computer vision, could explain neural activity in face patches remarkably well (Chang and Tsao, 2017). At the same time, a number of groups have found that activity in deep layers of convolutional neural networks can explain significant variance of neural responses in ventral temporal cortex (Yamins et al, 2014; Kalfas et al, 2017; Yildirim et al, 2020; Schrimpf et al, 2018; Raman and Hosoya, 2020). Here, we extend those results by comparing the efficacy of a large number of different computational models of face representation to account for neural activity in face patch AM.…”

Section: Discussion (mentioning)

confidence: 99%

Explaining face representation in the primate brain using different computational models

Chang, Egger, Vetter, et al.

2020

Preprint

Understanding how the brain represents the identity of complex objects is a central challenge of visual neuroscience. The principles governing object processing have been extensively studied in the macaque face patch system, a sub-network of inferotemporal (IT) cortex specialized for face processing (Tsao et al., 2006). A previous study reported that single face patch neurons encode axes of a generative model called the "active appearance" model (Chang and Tsao, 2017), which transforms 50-d feature vectors separately representing facial shape and facial texture into facial images (Cootes et al., 2001; Edwards et al., 1998). However, it remains unclear whether this model constitutes the best model for explaining face cell responses. Here, we recorded responses of cells in the most anterior face patch AM to a large set of real face images, and compared a large number of models for explaining neural responses. We found that the active appearance model better explained responses than any other model except CORnet-Z, a feedforward deep neural network trained on general object classification to classify non-face images, whose performance it tied on some face image sets and exceeded on others. Surprisingly, deep neural networks trained specifically on facial identification did not explain neural responses well. A major reason is that units in the network, unlike neurons, are less modulated by face-related factors unrelated to facial identification such as illumination.

“…Here, we sought to overcome these limitations by adopting the underlying theoretical idea (wiring cost minimization), but building upon recent advances in ANN models (1,12,36). Notably, that prior deep ANN modeling work has already qualitatively demonstrated the presence of at least some "face neurons" within model IT (36) and more recent studies have demonstrated the existence of face-selective units in deep ANNs (37)(38)(39). However, the correspondence of face processing in ANNs and the primate ventral stream has not been tested systematically.…”

Section: Category Choice (mentioning)

confidence: 99%

Topographic deep artificial neural networks reproduce the hallmarks of the primate inferior temporal cortex face processing network

Lee, Margalit, Jozwik, et al.

2020

Preprint

A salient characteristic of monkey inferior temporal (IT) cortex is the IT face processing network. Its hallmarks include: “face neurons” that respond more to faces than non-face objects, strong spatial clustering of those neurons in foci at each IT anatomical level (“face patches”), and the preferential interconnection of those foci. While some deep artificial neural networks (ANNs) are good predictors of IT neuronal responses, including face neurons, they do not explain those face network hallmarks. Here we ask if they might be explained with a simple, metabolically motivated addition to current ANN ventral stream models. Specifically, we designed and successfully trained topographic deep ANNs (TDANNs) to solve real-world visual recognition tasks (as in prior work), but, in addition, we also optimized each network to minimize a proxy for neuronal wiring length within its IT layers. We report that after this dual optimization, the model IT layers of TDANNs reproduce the hallmarks of the IT face network: the presence of face neurons, clusters of face neurons that quantitatively match those found in IT face patches, connectivity between those patches, and the emergence of face viewpoint invariance along the network hierarchy. We find that these phenomena emerge for a range of naturalistic experience, but not for highly unnatural training. Taken together, these results show that the IT face processing network could be a consequence of a basic hierarchical anatomy along the ventral stream, selection pressure on the visual system to accomplish general object categorization, and selection pressure to minimize axonal wiring length.
