The Invariance of Recognition to the Stretching of Faces is Not Explained by Familiarity or Warping to an Average Face

Hacker, Catrina; Biederman, Irving

doi:10.31234/osf.io/e5hgx

Cited by 15 publications

(3 citation statements)

References 32 publications

“…Humans can identify objects that are highly distorted or highly degraded. For instance, we can readily identify images of faces that are stretched by a factor of four (Hacker & Biederman, 2018), when images are partly occluded or presented in novel poses (Biederman, 1987), and when various sorts of visual noise are added to the image (Geirhos et al, 2021). By contrast, CNNs are much worse at generalizing under these conditions (Alcorn et al, 2019; Geirhos et al, 2018, 2021; Wang et al, 2018; Zhu, Tang, Park, Park, & Yuille, 2019).…”

Section: DNNs Are Poor at Identifying Degraded and Deformed Images (mentioning)

confidence: 99%

Deep problems with neural network models of human vision

Bowers, Dujmović, Montero et al. 2022

Behav Brain Sci

Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modelling approaches that focus on psychological data.

“…4.1.7 DNNs are poor at identifying degraded and deformed images: Humans can identify objects that are highly distorted or highly degraded. For instance, we can readily identify images of faces that are stretched by a factor of four (Hacker & Biederman, 2018), when images are partly occluded or presented in novel poses (Biederman, 1987), and when various sorts of visual noise are added to the image (Geirhos et al, 2021). By contrast, CNNs are much worse at generalizing under these conditions (Alcorn et al, 2019; Geirhos et al, 2018, 2021; Wang et al, 2018; Zhu, Tang, Park, Park, & Yuille, 2019).…”

Section: DNNs Fail to Show Uncrowding (mentioning)

confidence: 99%

Deep Problems with Neural Network Models of Human Vision

Bowers, Dujmović, Montero et al. 2022

Preprint

Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral benchmark datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain benchmark datasets (e.g., single cell responses or fMRI data). However, most behavioral and brain benchmarks report the outcomes of observational experiments that do not manipulate any independent variables, and we show that the good prediction on these datasets may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on predicting observational data. We conclude by briefly summarizing various promising modelling approaches that focus on psychological data.

“…DNNs are poor at identifying degraded and deformed images: Humans can identify objects that are highly distorted or highly degraded. For instance, we can readily identify images of faces that are stretched by a factor of four (Hacker & Biederman, 2018), when images are partly occluded or presented in novel poses, and when various sorts of visual noise are added to the image. By contrast, CNNs are much worse at generalizing under these conditions (Alcorn et al, 2019; Geirhos et al, 2018; Wang et al, Figure 9.…”

Section: DNNs Fail to Show Uncrowding (mentioning)

confidence: 99%