The Algonauts Project 2021 Challenge: How the Human Brain Makes Sense of a World in Motion

Cichy, Radoslaw Martin; Dwivedi, Kshitij; Lahner, Benjamin; Lascelles, Alex; Iamshchinina, Polina; Graumann, Monika; Andonian, Alex; Murty, N. Apurva Ratan; Kay, Kendrick; Roig, Gemma; Oliva, Aude

doi:10.48550/arxiv.2104.13714

Cited by 5 publications

(8 citation statements)

References 7 publications

“…We focus on predicting the brain response from the corresponding video stimuli. 40 We adopt the general voxel-wise neural encoding framework that has been widely used in the literature. [41][42][43] In particular, DNN models are used to extract feature representations from each individual video stimulus.…”

Section: Results (mentioning)

confidence: 99%

Upgrading Voxel-wise Encoding Model via Integrated Integration over Features and Brain Networks

Yang

2022

Preprint

A central goal of cognitive neuroscience is to build computational models that predict and explain neural responses to sensory inputs in the cortex. Recent studies attempt to borrow the representation power of deep neural networks (DNN) to predict the brain response and suggest a correspondence between artificial and biological neural networks in their feature representations. However, each DNN instance is often specified for certain computer vision tasks which may not lead to optimal brain correspondence. On the other hand, these voxel-wise encoding models focus on predicting single voxels independently, while brain activity often demonstrates rich and dynamic structures at the population and network levels during cognitive tasks. These two important properties suggest that we can improve the prevalent voxel-wise encoding models by integrating features from DNN models and by integrating cortical network information into the models. In this work, we propose a new unified framework that addresses these two aspects through DNN feature-level ensemble learning and brain atlas-level model integration. Our proposed approach leads to superior performance over previous DNN-based encoding models in predicting whole-brain neural activity during naturalistic video perception. Furthermore, our unified framework also facilitates the investigation of the brain's neural representation mechanism by accurately predicting the neural response corresponding to complex visual concepts.

“…Details on data acquisition and preprocessing are provided elsewhere. 40 Briefly, the dataset consists of 1102 fMRI brain responses per subject (10 subjects), 1000 for training, and 102 held out for online submission. Each stimulus is a 3-second clip of daily events, participants watched the video without playing the sound.…”

Section: Methods (mentioning)

confidence: 99%

Upgrading Voxel-wise Encoding Model via Integrated Integration over Features and Brain Networks

Yang

2022

Preprint

“…Higher values correspond to higher quality. conference (Cichy et al, 2021; Naselaris et al, 2018). For the challenge, participants submit the predictions of their computational model on held-out brain data (see http://algonauts.csail.mit.edu/challenge.html for the final challenge leaderboard and details).…”

Section: Cnr - Contrast To (mentioning)

confidence: 99%

“…In pursuit of interdisciplinary and transparent research, we used portions of BMD in the The Algonauts Project 2021: How the Human Brain Makes Sense of a World in Motion. This open challenge, in partnership with the Computational Cognitive Neuroscience (CCN) conference (Cichy et al, 2021; Naselaris et al, 2018), invites participants to predict held-out brain data using their computational models. The top three entries in The Algonauts Project 2021 challenge each took drastically different modeling approaches (see reports in Supplementary), highlighting the creative space opened by BMD lying at the intersection of natural and artificial intelligence research. For a full account of visual event understanding, research needs to look beyond the classical visual brain and into the whole brain, now possible with BMD.…”

mentioning

confidence: 99%

BOLD Moments: modeling short visual events through a video fMRI dataset and metadata

Lahner, Dwivedi, Iamshchinina et al. 2023

Preprint

Grasping the meaning of everyday visual events is a fundamental feat of human intelligence that hinges on diverse neural processes ranging from vision to higher-level cognition. Deciphering the neural basis of visual event understanding requires rich, extensive, and appropriately designed experimental data. However, this type of data is hitherto missing. To fill this gap, we introduce the BOLD Moments Dataset (BMD), a large dataset of whole-brain fMRI responses to over 1,000 short (3s) naturalistic video clips and accompanying metadata. We show visual events interface with an array of processes, extending even to memory, and we reveal a match in hierarchical processing between brains and video-computable deep neural networks. Furthermore, we showcase that BMD successfully captures temporal dynamics of visual events at second resolution. BMD thus establishes a critical groundwork for investigations of the neural basis of visual event understanding.

“…We formulate three desiderata for a suitable model of scene categorization: It should predict (1) the neural representations underlying scene categorization, (2) human scene categorization behavior, and (3) their relationship. A potential candidate class for the model are deep convolutional neural networks, which have been shown to predict activity in the visual cortex better than other models (Cichy et al, 2021; Schrimpf et al, 2020; Kietzmann et al, 2019; Yamins et al, 2014). A particular instantiation, a recurrent convolutional neural network (RCNN) named BLnet, that is, a model with learned bottom–up as well as lateral connectivity, has been shown to predict RTs in an object categorization task well and better than a range of control models (Spoerer, Kietzmann, Mehrer, Charest, & Kriegeskorte, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

Empirically Identifying and Computationally Modeling the Brain–Behavior Relationship for Human Scene Categorization

Karapetian, Boyanova, Pandaram et al. 2023

Journal of Cognitive Neuroscience

Humans effortlessly make quick and accurate perceptual decisions about the nature of their immediate visual environment, such as the category of the scene they face. Previous research has revealed a rich set of cortical representations potentially underlying this feat. However, it remains unknown which of these representations are suitably formatted for decision-making. Here, we approached this question empirically and computationally, using neuroimaging and computational modelling. For the empirical part, we collected EEG data and RTs from human participants during a scene categorization task (natural vs. man-made). We then related neural representations to behavior using a multivariate extension of signal detection theory. We observed a correlation specifically between ∼100 msec and ∼200 msec after stimulus onset, suggesting that the neural scene representations in this time period are suitably formatted for decision-making. For the computational part, we evaluated a recurrent convolutional neural network as a model of brain and behavior. Unifying our previous observations in an image-computable model, recurrent convolutional neural network predicted well the neural representations, the behavioral scene categorization data, as well as the relationship between them. Our results identify and computationally characterize the neural and behavioral correlates of scene categorization in humans.
