Piecing together the meaning of a narrative requires understanding not only the individual words but also the intricate relationships between them. How does the brain construct this kind of rich, contextual meaning from natural language? Recently, a new class of artificial neural networks—based on the Transformer architecture—has revolutionized the field of language modeling. Transformers integrate information across words via multiple layers of structured circuit computations, forming increasingly contextualized representations of linguistic content. In this paper, we deconstruct these circuit computations and analyze the associated "transformations" (alongside the more commonly studied "embeddings") at each layer to provide a fine-grained window onto linguistic computations in the human brain. Using functional MRI data acquired while participants listened to naturalistic spoken stories, we find that these transformations capture a hierarchy of linguistic computations across cortex, with transformations at later layers in the model mapping onto higher-level language areas in the brain. We then decompose these transformations into individual, functionally specialized "attention heads" and demonstrate that the emergent syntactic computations performed by individual heads differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers, contextual distances, and syntactic dependencies in a low-dimensional cortical space. Our findings provide a new basis for using the internal structure of large language models to better capture the cascade of cortical computations that support natural language comprehension.
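To make the extraction pipeline concrete, the sketch below shows one way to pull per-layer embeddings, attention-sublayer "transformations," and per-head attention weights from a Transformer language model. The choice of GPT-2 via the HuggingFace transformers library, and the hook placement on the attention sublayer, are illustrative assumptions rather than the study's exact model or extraction points.

```python
# Minimal sketch (assumptions: GPT-2 via HuggingFace; "transformations" taken as
# the attention sublayer's residual update, captured with forward hooks).
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained(
    "gpt2", output_hidden_states=True, output_attentions=True
)
model.eval()

transformations = {}  # layer index -> attention-sublayer output

def make_hook(layer_idx):
    def hook(module, inputs, outputs):
        # outputs[0]: (batch, seq_len, hidden) update written to the residual stream
        transformations[layer_idx] = outputs[0].detach()
    return hook

for i, block in enumerate(model.h):
    block.attn.register_forward_hook(make_hook(i))

tokens = tokenizer("The cat the dog chased ran away.", return_tensors="pt")
with torch.no_grad():
    out = model(**tokens)

embeddings = out.hidden_states  # (n_layers + 1) tuples of (batch, seq, hidden)
attn_weights = out.attentions   # n_layers tuples of (batch, n_heads, seq, seq)
```

An encoding analysis of the kind described above would then regress each layer's transformations (or each head's contribution) against voxelwise fMRI activity to localize the corresponding computations across cortex.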
Introduction
Connectome-based predictive modeling (CPM) is a recently developed machine-learning framework for predicting individual differences in behavior from functional brain connectivity (FC). In these models, FC is typically operationalized as the Pearson correlation between brain regions' fMRI time courses. Pearson's correlation is limited, however, in that it captures only linear relationships. We developed a more generalized metric of FC based on information flow. This measure represents FC by abstracting the brain as a flow network of nodes that send bits of information to each other, where bits are quantified with an information-theoretic statistic called transfer entropy.

Methods
With a sample of individuals performing a sustained attention task and resting during functional magnetic resonance imaging (fMRI; n = 25), we used the CPM framework to build machine-learning models that predict attention from FC patterns measured with information flow. Models trained on n − 1 participants' task-based patterns were applied to an unseen individual's resting-state pattern to predict task performance. For further validation, we applied our model to two independent datasets that included resting-state fMRI data and a measure of attention (Attention Network Task performance, n = 41; stop-signal task performance, n = 72).

Results
Our model significantly predicted individual differences in attention task performance across all three datasets.

Conclusions
Information flow may be a useful complement to Pearson's correlation as a measure of FC because of its advantages for nonlinear analysis and network-structure characterization.
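As a concrete illustration of the information-flow measure, the sketch below implements a simple histogram-based transfer-entropy estimator with lag 1 and builds a pairwise directed FC matrix from regional time courses. The binning scheme, lag, and estimator are assumptions made for illustration; the study's actual estimator and parameters may differ.

```python
# Minimal sketch (assumptions: lag-1 transfer entropy, equiprobable binning).
import numpy as np

def transfer_entropy(x, y, n_bins=8):
    """Transfer entropy TE(x -> y) in bits, with lag 1:
    TE = H(y_t | y_{t-1}) - H(y_t | y_{t-1}, x_{t-1})."""
    def discretize(s):
        # Rank-based, roughly equiprobable bins for a continuous series
        ranks = np.argsort(np.argsort(s))
        return ranks * n_bins // len(s)

    x, y = discretize(x), discretize(y)
    yt, yp, xp = y[1:], y[:-1], x[:-1]  # target, past of y, past of x

    def entropy(*series):
        # Joint entropy from the empirical joint distribution
        joint = np.stack(series, axis=1)
        _, counts = np.unique(joint, axis=0, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()

    # Conditional entropies expanded via the chain rule
    return (entropy(yt, yp) - entropy(yp)) - (entropy(yt, yp, xp) - entropy(yp, xp))

# Example: information-flow FC matrix for synthetic regional time courses
rng = np.random.default_rng(0)
data = rng.standard_normal((10, 200))  # (n_regions, n_timepoints)
fc = np.array([[transfer_entropy(data[i], data[j]) if i != j else 0.0
                for j in range(len(data))] for i in range(len(data))])
```

Note that, unlike a Pearson-correlation matrix, this FC matrix is directed (fc[i, j] generally differs from fc[j, i]), which is what allows the flow-network abstraction to distinguish regions that send information from those that receive it.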
The extent to which brain functions are localized or distributed is a foundational question in neuroscience. In the human brain, common fMRI methods such as cluster correction, atlas parcellation, and anatomical searchlight are biased by design toward finding localized representations. Here we introduce the functional searchlight approach as an alternative to anatomical searchlight analysis, the most commonly used exploratory multivariate fMRI technique. Functional searchlight removes this anatomical bias by grouping voxels based only on functional similarity, ignoring anatomical proximity. We report evidence that visual and auditory features from deep neural networks and semantic features from a natural language processing model, as well as object representations, are more widely distributed across the brain than previously acknowledged, and that functional searchlight can improve model-based similarity and decoding accuracy. This approach provides a new way to evaluate and constrain computational models with brain activity and pushes our understanding of human brain function further along the spectrum from strict modularity toward distributed representation.
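The core mechanic is straightforward to sketch: group voxels by the similarity of their time courses rather than by their location, then run the usual multivariate analysis within each functional group. The sketch below uses k-means clustering and a logistic-regression decoder on synthetic data; the clustering method, number of groups, and decoder are all illustrative assumptions, not necessarily the paper's choices.

```python
# Minimal sketch of a functional searchlight (assumptions: k-means grouping,
# logistic-regression decoding, synthetic BOLD data with assumed shapes).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 5000))   # (n_timepoints, n_voxels) BOLD matrix
labels = rng.integers(0, 2, size=200)     # condition label per timepoint

# 1) Group voxels purely by functional similarity of their time courses,
#    ignoring anatomical location (k is an assumed free parameter).
k = 100
voxel_groups = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data.T)

# 2) Run a decoding analysis within each functional "searchlight".
scores = []
for g in range(k):
    searchlight = data[:, voxel_groups == g]
    scores.append(cross_val_score(LogisticRegression(max_iter=1000),
                                  searchlight, labels, cv=5).mean())
```

Because group membership is defined purely functionally, a single "searchlight" can span anatomically distant voxels, which is what lets the analysis detect distributed representations that an anatomical searchlight would split apart.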