2023
DOI: 10.1038/s42256-023-00714-5

Decoding speech perception from non-invasive brain recordings

Alexandre Défossez,
Charlotte Caucheteux,
Jérémy Rapin
et al.

Abstract: Decoding speech from brain activity is a long-awaited goal in both healthcare and neuroscience. Invasive devices have recently led to major milestones in this regard: deep-learning algorithms trained on intracranial recordings can now start to decode elementary linguistic features such as letters, words and audio-spectrograms. However, extending this approach to natural speech and non-invasive brain recordings remains a major challenge. Here we introduce a model trained with contrastive learning to decode self…
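The contrastive objective mentioned in the abstract pairs each segment of brain activity with the speech it was recorded under. A minimal InfoNCE-style sketch in NumPy follows; the function name, temperature, and dimensions are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def clip_style_contrastive_loss(brain_z, speech_z, temperature=0.1):
    """InfoNCE-style loss: for each brain segment, the matching speech
    segment (same row) is the positive; the other rows in the batch act
    as negatives. Illustrative sketch, not the paper's exact objective."""
    # L2-normalise both sets of embeddings.
    brain_z = brain_z / np.linalg.norm(brain_z, axis=1, keepdims=True)
    speech_z = speech_z / np.linalg.norm(speech_z, axis=1, keepdims=True)
    logits = brain_z @ speech_z.T / temperature        # (batch, batch)
    # Log-softmax over candidates; matched pairs sit on the diagonal.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
speech = rng.normal(size=(8, 16))                # fake speech embeddings
brain = speech + 0.1 * rng.normal(size=(8, 16))  # noisy matched brain embeddings
loss = clip_style_contrastive_loss(brain, speech)
```

Minimising this loss pulls each brain embedding toward its matching speech embedding and away from the other candidates in the batch, which is what lets the decoder rank candidate speech segments at test time.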

Cited by 56 publications (24 citation statements)
References 85 publications
“…Our results expand this work to the task of decoding images from MEG data and provide additional insight into how deep learning and subject embeddings help group‐level decoding models. In concurrent work, Défossez et al (2022) have also shown the effectiveness of subject embeddings in group‐level speech decoding. They have also compared it to subject‐specific layers as a way of dealing with between‐subject performance and found this latter approach slightly better.…”
Section: Discussion
confidence: 96%
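The subject-embedding approach discussed above can be sketched as a lookup table of per-subject vectors concatenated onto each trial's sensor features, so that one group-level decoder conditions on subject identity. A hypothetical NumPy sketch; all names and dimensions are illustrative, and a real model would learn the table jointly with the decoder weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, emb_dim, n_sensors = 4, 8, 32

# One vector per subject; in a real model these are learned jointly
# with the decoder weights (here: random stand-ins).
subject_table = rng.normal(size=(n_subjects, emb_dim))

def add_subject_context(features, subject_ids):
    """Concatenate each trial's sensor features with the embedding of
    the subject it was recorded from, so a single group-level decoder
    can condition on subject identity. Hypothetical minimal sketch."""
    return np.concatenate([features, subject_table[subject_ids]], axis=1)

trials = rng.normal(size=(10, n_sensors))   # fake MEG features per trial
ids = rng.integers(0, n_subjects, size=10)  # which subject each trial came from
x = add_subject_context(trials, ids)        # (10, n_sensors + emb_dim)
```

The alternative mentioned in the quote, subject-specific layers, instead gives each subject their own learned linear map over the sensors; the embedding table is cheaper but was found slightly weaker in that comparison.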
“…By analyzing such neural activities, we can uncover the encoding mechanisms of semantics in the brain [3]. A variety of neural signals, including EEG, Functional Magnetic Resonance Imaging (fMRI), and Electrocorticography (ECoG), are employed in language-related tasks, from academic research such as investigating language processing in the brain to practical applications such as language decoding in BCI [4-9]. Recently, many neurolinguistics studies have utilized both classical machine learning methods and modern deep learning methods from NLP to explore language-related problems [10-16].…”
Section: Background and Summary
confidence: 99%
“…This mathematical technique transforms the time-domain signal into the frequency domain, revealing the spectrum of frequencies present in the neural recordings. We defined frequency bands of interest: Theta (4-8 Hz), Alpha (8-12 Hz), Beta (12-30 Hz), and Gamma (30-100 Hz), to categorize the neural oscillations according to their respective frequency ranges.…”
Section: Technical Validation, Classic Sensor-Level EEG Analysis
confidence: 99%
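The band definitions in the quote above amount to averaging FFT power within each frequency range. A minimal NumPy sketch, assuming a single-channel recording and a hypothetical 256 Hz sampling rate (not the authors' actual pipeline):

```python
import numpy as np

def band_power(signal, fs, bands):
    """Mean FFT power of a single-channel recording within each named
    frequency band. Sketch of the standard spectral analysis described
    above, not the cited study's exact pipeline."""
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    return {name: power[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in bands.items()}

fs = 256                                  # Hz, hypothetical sampling rate
t = np.arange(0, 2, 1.0 / fs)             # 2 s of signal
eeg = np.sin(2 * np.pi * 10 * t)          # synthetic 10 Hz (alpha) rhythm
bands = {"Theta": (4, 8), "Alpha": (8, 12),
         "Beta": (12, 30), "Gamma": (30, 100)}
bp = band_power(eeg, fs, bands)
```

Because the synthetic signal is a pure 10 Hz oscillation, virtually all of its power falls into the Alpha band, which is the sanity check this kind of analysis relies on.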
“…To gain further insight into the nature of EEG predictions based on Whisper’s explicit transformation of speech-to-language, we ran comparative analyses against two self-supervised speech models that are trained entirely on sound data, with no access to language. For this, we selected Wav2Vec2 (Baevski et al 2020) and HuBERT (Hsu et al 2021), which in different studies have provided high-accuracy fMRI models (Vaidya et al 2022, Millet et al 2022) and a strong basis for decoding model features from MEG and/or EEG data (Défossez et al 2022, Han et al 2023).…”
Section: Interpretation: How Whisper Differs From Self-supervised Spe…
confidence: 99%
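Decoding model features from M/EEG, as in the comparison above, is commonly done with a regularized linear map from brain features to the speech model's activations. A ridge-regression sketch on synthetic data; all names and dimensions are hypothetical stand-ins, not any cited study's setup:

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression, W = (X'X + alpha*I)^(-1) X'Y: the
    usual regularized linear map from brain features X to speech-model
    features Y. Illustrative stand-in, not any cited study's pipeline."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                     # fake EEG features (trials x channels)
W_true = rng.normal(size=(20, 5))                  # hypothetical true mapping
Y = X @ W_true + 0.01 * rng.normal(size=(200, 5))  # fake speech-model features
W = ridge_fit(X, Y, alpha=0.1)
Y_hat = X @ W                                      # decoded feature estimates
```

The ridge penalty (`alpha`) matters in practice because real M/EEG feature matrices have many correlated channels, where an unregularized least-squares fit would be unstable.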