Large language models (LLMs) and LLM-driven chatbots such as ChatGPT have shown remarkable capacities in comprehending and producing language. However, their internal workings remain a black box in cognitive terms, and it is unclear whether LLMs and chatbots can develop humanlike characteristics in language use. Cognitive scientists have devised many experiments that probe, and have made great progress in explaining, how people process language. We subjected ChatGPT to 12 of these experiments, preregistered and with 1,000 runs per experiment. In 10 of them, ChatGPT replicated the human pattern of language use. It associated unfamiliar words with different meanings depending on their forms, continued to access recently encountered meanings of ambiguous words, reused recent sentence structures, reinterpreted implausible sentences that were likely to have been corrupted by noise, glossed over errors, drew reasonable inferences, associated causality with different discourse entities according to verb semantics, and accessed different meanings and retrieved different words depending on the identity of its interlocutor. However, unlike humans, it did not prefer using shorter words to convey less informative content and it did not use context to disambiguate syntactic ambiguities. We discuss how these convergences and divergences may occur in the transformer architecture. Overall, these experiments demonstrate that LLM-driven chatbots like ChatGPT are capable of mimicking human language processing to a great extent, and that they have the potential to provide insights into how people learn and use language.
Most words are low in frequency, yet a prevailing theory of word meaning (the distributional hypothesis: that words with similar meanings occur in similar contexts) and corresponding computational models struggle to represent low-frequency words. We conducted two preregistered experiments to test the hypothesis that similar-sounding words flesh out deficient semantic representations. In Experiment 1, native English speakers made semantic relatedness decisions about a cue (e.g., dodge) followed either by a target that overlaps in form and meaning with a higher frequency word (evade, which overlaps with avoid) or by a control (elude), matched on distributional and formal similarity to the cue. (Participants did not see higher frequency words like avoid.) As predicted, participants decided faster and more often that overlapping targets, compared to controls, were semantically related to cues. In Experiment 2, participants read sentences containing the same cues and targets (e.g., The kids dodged something and She tried to evade/elude the officer). We used MouseView.js to blur the sentences and create a fovea-like aperture directed by the participant’s cursor, allowing us to approximate fixation duration. While we did not observe the predicted difference at the target region (e.g., evade/elude), we found a lag effect, with shorter fixations on words following overlapping targets, suggesting easier integration of those meanings. These experiments provide evidence that words with overlapping forms and meanings bolster representations of low-frequency words, which supports approaches to natural language processing that incorporate both formal and distributional information and which revises assumptions about how an optimal language will evolve.