2021
DOI: 10.1088/1741-2552/abecf0

Data-driven machine learning models for decoding speech categorization from evoked brain responses

Abstract: Objective. Categorical perception (CP) of audio is critical to understand how the human brain perceives speech sounds despite widespread variability in acoustic properties. Here, we investigated the spatiotemporal characteristics of auditory neural activity that reflects CP for speech (i.e. differentiates phonetic prototypes from ambiguous speech sounds). Approach. We recorded 64-channel electroencephalograms as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used support vec…
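The decoding setup the abstract describes (single-trial EEG responses classified into phonetic categories with a support vector machine) can be illustrated with a minimal sketch. The data below are synthetic feature vectors standing in for 64-channel evoked activity; all shapes, the two-class setup, and parameter values are illustrative assumptions, not the authors' actual settings.

```python
# Hedged sketch: linear SVM decoding of (synthetic) single-trial EEG
# features into two vowel categories, with cross-validated accuracy.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_features = 200, 64           # e.g. one feature per electrode
X = rng.normal(size=(n_trials, n_features))
y = rng.integers(0, 2, size=n_trials)    # two phonetic categories
X[y == 1] += 0.5                         # inject a weak class difference

# Standardize features, then fit a linear SVM; score by 5-fold CV.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

In practice the feature vectors would come from preprocessed, epoched EEG rather than random draws, but the pipeline shape (scaling, linear classifier, cross-validation) is the standard one for this kind of decoding.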


Cited by 11 publications (9 citation statements)
References 103 publications
“…First, we found mouse tracks diverged almost immediately (<100 ms) after stimulus presentation (Figure 4C), much earlier than listeners' collective RTs (∼800 ms). This is well within the timeframe (<250 ms) with which speech categories begin to emerge in auditory-sensory brain activity (Bidelman et al, 2013;Mahmud et al, 2021). Contextual effects due to stimulus history have also been observed in both animal (Lopez Espejo et al, 2019) and human (Carter et al, 2022) superior temporal gyrus.…”
Section: Discussionsupporting
confidence: 67%
“…Intracranial sources underlying CP micro-states: these brain areas converge with both hypothesis- and data-driven work, which has shown a similar engagement of these regions in categorical decisions related to speech perception. For example, in our recent study (Mahmud, Yeasin, & Bidelman, 2021) applying machine learning techniques (e.g., neural classifiers, feature mining, stability selection) to full-brain, source-reconstructed EEGs, we showed that 13 (out of 68) brain regions of the Desikan-Killiany (DK) atlas (Desikan et al, 2006) […] ms). Indeed, microstates 1 and 9, identified via Bayesian non-parametric analysis of response RTs, isolate these patterned activities to nearly identical brain areas (STG, IFG; see Fig.…”
Section: Resultsmentioning
confidence: 99%
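The quote above mentions stability selection as one of the feature-mining techniques applied to source-reconstructed EEG. A hedged sketch of that general idea follows: repeatedly refit a sparse (L1-penalized) classifier on random subsamples and retain only the features selected in a large fraction of refits. The "brain regions," effect sizes, and thresholds below are all illustrative assumptions, not the cited study's configuration.

```python
# Hedged sketch of stability selection: features (standing in for
# brain-region activations) that survive L1 selection across many
# subsamples are deemed "stable."
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_trials, n_regions = 300, 68            # e.g. 68 atlas regions (assumption)
X = rng.normal(size=(n_trials, n_regions))
y = rng.integers(0, 2, size=n_trials)
informative = [3, 10, 42]                # arbitrary "truly predictive" regions
X[np.ix_(y == 1, informative)] += 1.0    # inject class-dependent signal

n_boot = 100
freq = np.zeros(n_regions)
for _ in range(n_boot):
    # Refit a sparse classifier on a random half of the trials.
    idx = rng.choice(n_trials, size=n_trials // 2, replace=False)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X[idx], y[idx])
    freq += (np.abs(clf.coef_[0]) > 1e-8)  # count nonzero coefficients

# Regions selected in at least 80% of refits count as stable.
stable = np.where(freq / n_boot >= 0.8)[0]
print("stable regions:", stable.tolist())
```

The appeal of this scheme is that spurious features rarely survive many independent refits, so the stable set is far more reproducible than a single sparse fit.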
“…We replicate and extend these prior findings by demonstrating a similar neural network describing listeners' speed of categorical decisions (i.e., RTs). It is worth noting that our microstate-based analysis here used an entirely different decoding approach, applied to CP decision speeds (RTs) rather than to listeners' binary labels of speech sounds as in previous studies (Mahmud et al, 2021). Yet, the consistency of our findings across divergent studies, methods, and behavioral assays is striking.…”
Section: Resultsmentioning
confidence: 99%
“…Frequently employed feature extraction algorithms include the common spatial pattern (CSP), filter bank common spatial pattern (FBCSP), Fourier transform, power spectrum analysis, wavelet transform, autoregressive model, and so forth [13][14][15][16][17]. Combining the features extracted by these algorithms with traditional classifiers such as linear discriminant analysis, support vector machines (SVM), and nearest-neighbor classifiers has made essential contributions to decoding EEG signals [18,19]. However, achieving high decoding accuracy with such manual features is difficult, since human experience or prior knowledge is injected when the features are extracted.…”
Section: Introductionmentioning
confidence: 99%
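The classical pipeline that quote describes (hand-crafted spatial features fed to a traditional classifier) can be sketched with a minimal common-spatial-pattern implementation. The "EEG" epochs below are synthetic, and the channel counts, epoch lengths, and class-variance difference are illustrative assumptions.

```python
# Hedged sketch of CSP + LDA, the kind of manual-feature pipeline the
# quoted passage contrasts with learned representations.
import numpy as np
from scipy.linalg import eigh
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_epochs, n_channels, n_times = 100, 8, 256
X = rng.normal(size=(n_epochs, n_channels, n_times))
y = rng.integers(0, 2, size=n_epochs)
X[y == 1, 0] *= 2.0          # class 1 has extra variance on channel 0

def csp_filters(X, y, n_components=2):
    """Generalized eigendecomposition of the two class-averaged
    covariance matrices; keep the most extreme spatial filters."""
    covs = [np.mean([e @ e.T / e.shape[1] for e in X[y == c]], axis=0)
            for c in (0, 1)]
    vals, vecs = eigh(covs[0], covs[0] + covs[1])  # ascending eigenvalues
    order = np.argsort(vals)
    picks = np.r_[order[:n_components // 2], order[-(n_components // 2):]]
    return vecs[:, picks].T                        # (n_components, n_channels)

W = csp_filters(X, y)
# Log-variance of the spatially filtered epochs = standard CSP features.
feats = np.array([np.log(np.var(W @ e, axis=1)) for e in X])
scores = cross_val_score(LinearDiscriminantAnalysis(), feats, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

FBCSP extends this by running the same decomposition per frequency band and pooling the features, which is one way the hand-crafted pipelines cited above scale up before a deep network replaces the feature step entirely.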