2022
DOI: 10.1101/2022.11.18.517004
Preprint

High-resolution image reconstruction with latent diffusion models from human brain activity

Abstract: Reconstructing visual experiences from human brain activity offers a unique way to understand how the brain represents the world, and to interpret the connection between computer vision models and our visual system. While deep generative models have recently been employed for this task, reconstructing realistic images with high semantic fidelity is still a challenging problem. Here, we propose a new method based on a diffusion model (DM) to reconstruct images from human brain activity obtained via functional magnetic resonance imaging (fMRI)…
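To make the general recipe the abstract points at concrete, here is a minimal sketch of reconstructing images from fMRI with a latent diffusion model: a regularized linear map from voxel activity to the LDM's latent space, whose predictions would then be rendered by the pretrained decoder. This is an illustration of the general approach, not the authors' exact pipeline; all array sizes, the regularization strength, and the data themselves are placeholder assumptions.

```python
# Sketch: decode images from fMRI by (1) learning a ridge-regression map from
# voxel activity to the latent space of a pretrained latent diffusion model
# (LDM), then (2) handing the predicted latents to the LDM's decoder.
# Synthetic data stand in for real fMRI responses and image latents.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_trials, n_voxels = 600, 2000          # fMRI trials x visual-cortex voxels (assumed)
latent_dim = 4 * 32 * 32                # e.g. a 4x32x32 LDM latent, flattened (assumed)

# Placeholder data: a real experiment would use per-trial beta estimates and
# latents obtained by encoding the presented images with the LDM's own encoder.
X = rng.standard_normal((n_trials, n_voxels)).astype(np.float32)    # fMRI
Z = rng.standard_normal((n_trials, latent_dim)).astype(np.float32)  # latents

X_tr, X_te, Z_tr, Z_te = train_test_split(X, Z, test_size=0.2, random_state=0)

# Voxel-to-latent regression; alpha would normally be tuned by cross-validation.
reg = Ridge(alpha=1e4)
reg.fit(X_tr, Z_tr)
Z_pred = reg.predict(X_te)              # predicted latents for held-out trials

print("predicted latent block shape:", Z_pred.reshape(-1, 4, 32, 32).shape)
# In a full pipeline these latents (plus a predicted semantic/text embedding)
# would condition or seed a pretrained LDM, whose decoder renders the final
# reconstructed images.
```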

Cited by 51 publications (51 citation statements)
References 48 publications
“…For example, variational autoencoders (VAEs) have been applied to infer low-dimensional representations of single-trial neural population dynamics (Pandarinath et al., 2018), while generative adversarial networks (GANs) have been used for the task of spike-train generation (Molano-Mazon et al., 2018; Ramesh et al., 2019), as well as to decode images from single neuron and fMRI data (Ponce et al., 2019; Lin et al., 2022). The recently proposed denoising diffusion probabilistic models (DDPMs) have also been applied to improve neural decoding performance, in particular leveraging latent diffusion models (Rombach et al., 2022) to predict viewed images from fMRI data (Takagi and Nishimoto, 2022; Chen et al., 2023).…”
Section: Introduction (mentioning; confidence: 99%)
“…Other studies have decoded seen natural images 9, 10 or videos 11 using visual features inspired by neurophysiological discoveries. Recently, by incorporating the assistance of deep neural networks (DNNs) 12, 13 and generative models 14–21, several studies have achieved higher-fidelity natural image reconstruction 22–26, which has become a tool for investigating the visual processing in the brain (e.g., visual representation, attention 27, and illusion 28).…”
Section: Introduction (mentioning; confidence: 99%)
“…Deep convolutional neural networks (DCNNs) have entered the computational modeling scene with high predictive performance of both object category and brain dynamics during object categorization tasks (1–4). These predictions on brain dynamics are not limited to low-level image statistics but also include high-level features such as animacy, object category and semantics (5–9). In fact, DCNNs' predictive performance on visual processes surpassed hand-engineered, biologically-inspired models (e.g.…”
Section: Introduction (mentioning; confidence: 99%)
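The statement above summarizes encoding-model work in which DCNN features predict brain responses. As a purely illustrative sketch, not any cited study's pipeline, the following shows that standard recipe with simulated feature and response matrices; every dimension, the alpha grid, and the data themselves are assumptions.

```python
# Sketch of a DCNN-based encoding analysis: regress (simulated) brain responses
# onto features from a pretrained convolutional-network layer and score
# predictions on held-out stimuli. Feature extraction is simulated so the
# example stays self-contained; in practice the features would come from a
# real DCNN applied to the stimulus images.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(1)

n_images, n_features, n_channels = 400, 2048, 64     # stimuli x DCNN features x sensors/voxels (assumed)
F = rng.standard_normal((n_images, n_features))      # stand-in for layer activations
W_true = rng.standard_normal((n_features, n_channels)) * 0.05
Y = F @ W_true + rng.standard_normal((n_images, n_channels))  # simulated responses

train, test = slice(0, 320), slice(320, None)

model = RidgeCV(alphas=np.logspace(0, 5, 11))        # cross-validated regularization
model.fit(F[train], Y[train])
Y_hat = model.predict(F[test])

# Per-channel prediction accuracy (Pearson r), the usual encoding-model metric.
r = [np.corrcoef(Y[test][:, c], Y_hat[:, c])[0, 1] for c in range(n_channels)]
print(f"mean predictive correlation across channels: {np.mean(r):.2f}")
```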