Neural decoding via the latent space of deep neural network models can infer perceived and imagined images from neural activity, even when the image is novel to both the subject and the decoder. Brain-computer interfaces (BCIs) that use such a latent space would enable a subject to retrieve an intended image from a large dataset on the basis of their neural activity, but such BCIs have not yet been realized. Here, we used neural decoding in a closed-loop condition to retrieve images of instructed categories from 2.3 million images on the basis of the latent vector inferred from electrocorticographic signals recorded over the visual cortices. Using the latent space of a contrastive language-image pretraining (CLIP) model, two subjects retrieved images with significant accuracy that exceeded 80% for two of the instructions. In contrast, image retrieval failed when the latent space of another model, AlexNet, was used. In another task, in which subjects imagined an image while viewing a different image, the imagery brought the inferred latent vector significantly closer to the vector of the imagined category in the CLIP latent space but moved it significantly further away in the AlexNet latent space, even though the same electrocorticographic signals from nine subjects were decoded. These results show that humans can retrieve intended information via a closed-loop BCI given an appropriate latent space.
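The core retrieval step described above, matching a latent vector decoded from neural signals against the latent vectors of a large image gallery, can be illustrated with a minimal nearest-neighbor search by cosine similarity. This is a sketch under stated assumptions, not the authors' implementation: the decoded vector is assumed to come from some mapping of electrocorticographic features onto CLIP image embeddings, and the function and array names (`retrieve_images`, `gallery_embeddings`) are hypothetical.

```python
# Illustrative sketch only: nearest-neighbor image retrieval in a latent space
# via cosine similarity. Shapes and names are hypothetical; random data stands
# in for precomputed CLIP embeddings of a large image gallery.
import numpy as np

def retrieve_images(decoded_vec: np.ndarray,
                    gallery_embeddings: np.ndarray,
                    top_k: int = 5) -> np.ndarray:
    """Return indices of the top_k gallery images whose latent vectors are
    most cosine-similar to the decoded latent vector.

    decoded_vec:        (d,)   latent vector inferred from neural signals
    gallery_embeddings: (n, d) precomputed latent vectors of candidate images
    """
    # L2-normalize so the dot product equals cosine similarity.
    v = decoded_vec / np.linalg.norm(decoded_vec)
    g = gallery_embeddings / np.linalg.norm(gallery_embeddings,
                                             axis=1, keepdims=True)
    sims = g @ v                        # (n,) cosine similarities
    return np.argsort(-sims)[:top_k]    # indices of best-matching images

# Usage example with random stand-in data (hypothetical sizes: n=10,000, d=512).
rng = np.random.default_rng(0)
gallery = rng.standard_normal((10_000, 512))
decoded = rng.standard_normal(512)
print(retrieve_images(decoded, gallery))
```

In a closed-loop setting, the returned candidates would be presented back to the subject, whose subsequent neural activity could then refine the decoded vector; the sketch covers only the latent-space matching step.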