Capturing human categorization of natural images by combining deep networks and cognitive models

Battleday, Ruairidh M.; Peterson, Joshua C.; Griffiths, Thomas L.

doi:10.1038/s41467-020-18946-z

Cited by 73 publications

(115 citation statements)

References 59 publications

Supporting

Mentioning

111

Contrasting


“…8), shedding light on a 40-year-old debate 5. These predictions are consistent with a recent demonstration that a prototype-based rule can match the performance of an exemplar model on categorization of familiar high-dimensional stimuli 58. We go beyond prior work by (1) demonstrating that prototype learning achieves superior performance on few-shot learning of novel naturalistic concepts, (2) precisely characterizing the tradeoff as a joint function of concept manifold dimensionality and the number of training examples (Fig.…”

Section: Discussion (supporting)

confidence: 90%

“…Such increased model complexity raises foundational questions about the appropriate comparisons between brains and machine-based models 55. Previous approaches based on behavioral performance 16,25,56-58, neuron 46 or circuit 49 matching, linear regression between representations 22, or representational similarity analysis 24, reveal a reasonable match between the two. However, our higher-resolution decomposition of performance into a fundamental set of observable geometric properties reveals significant mismatches (Fig.…”

Section: Discussion (mentioning)

confidence: 99%


The Geometry of Concept Learning

Sorscher, Ganguli, Sompolinsky (2021), Preprint


Understanding the neural basis of our remarkable cognitive capacity to accurately learn novel high-dimensional naturalistic concepts from just one or a few sensory experiences constitutes a fundamental problem. We propose a simple, biologically plausible, mathematically tractable, and computationally powerful neural mechanism for few-shot learning of naturalistic concepts. We posit that the concepts we can learn given few examples are defined by tightly circumscribed manifolds in the neural firing rate space of higher order sensory areas. We further posit that a single plastic downstream neuron can learn such concepts from few examples using a simple plasticity rule. We demonstrate the computational power of our simple proposal by showing it can achieve high few-shot learning accuracy on natural visual concepts using both macaque inferotemporal cortex representations and deep neural network models of these representations, and can even learn novel visual concepts specified only through language descriptions. Moreover, we develop a mathematical theory of few-shot learning that links neurophysiology to behavior by delineating several fundamental and measurable geometric properties of high-dimensional neural representations that can accurately predict the few-shot learning performance of naturalistic concepts across all our experiments. We discuss several implications of our theory for past and future studies in neuroscience, psychology and machine learning.


“…CNNs can be used to supply representations for more complex naturalistic images, which can be further modified to better reflect human judgments before being input into the same kinds of cognitive model (middle pathways). 50,64,73 End-to-end models offer the opportunity to solve both of these problems simultaneously and learn a representation for naturalistic stimuli that satisfies the constraints inherent in higher-level cognitive models (right pathway). 136 Many other components of the modern deep learning framework arose during this period from the collaboration of psychologists, neuroscientists, and the computational vision community, including the development of hierarchical feed-forward visual models based on stacks of nonlinear feature maps and pooling between layers.…”

Section: Deep and Convolutional Neural Network (mentioning)

confidence: 99%

“…This acts to sharpen or lessen the influence of exemplars on subsequent category judgments, and therefore allows the model to control for overall stimulus discriminability in the relevant psychological space. 94 Battleday et al. 50 found that CNNs provided the best representational basis for modeling human categorization judgments, outperforming deep unsupervised and traditional computer vision methods. Indeed, the choice of stimulus representation affected overall performance to a much greater extent than the choice of categorization model.…”

Section: Similarity (mentioning)

confidence: 99%

“…There are known cases in which prototype and exemplar models make similar predictions, for example, if category representations are well captured by simple Gaussians, but how these situations relate to high-dimensional representations of complex stimuli remains unclear. Battleday et al. 50 conducted a simulation study that investigated how the number of dimensions and training samples affected the different modeling strategies. They found that while exemplar models do outperform prototype models in low-dimensional spaces, no such difference exists in high-dimensional representational spaces after training with very large numbers of stimuli.…”

Section: Similarity (mentioning)

confidence: 99%


From convolutional neural networks to models of higher‐level cognition (and back again)

Battleday, Peterson, Griffiths (2021), Annals of the New York Academy of Sciences

Self Cite


The remarkable successes of convolutional neural networks (CNNs) in modern computer vision are by now well known, and they are increasingly being explored as computational models of the human visual system. In this paper, we ask whether CNNs might also provide a basis for modeling higher-level cognition, focusing on the core phenomena of similarity and categorization. The most important advance comes from the ability of CNNs to learn high-dimensional representations of complex naturalistic images, substantially extending the scope of traditional cognitive models that were previously only evaluated with simple artificial stimuli. In all cases, the most successful combinations arise when CNN representations are used with cognitive models that have the capacity to transform them to better fit human behavior. One consequence of these insights is a toolkit for the integration of cognitively motivated constraints back into CNN training paradigms in computer vision and machine learning, and we review cases where this leads to improved performance. A second consequence is a roadmap for how CNNs and cognitive models can be more fully integrated in the future, allowing for flexible end-to-end algorithms that can learn representations from data while still retaining the structured behavior characteristic of human cognition.


Grounding Psychological Shape Space in Convolutional Neural Networks

Bechberger, Kühnberger (2022), Lecture Notes in Computer Science

