Injecting structural hints: Using language models to study inductive biases in language learning

Papadimitriou, Isabel; Jurafsky, Dan

doi:10.18653/v1/2023.findings-emnlp.563

Findings of the Association for Computational Linguistics: EMNLP 2023 (2023)

Abstract: Both humans and large language models are able to learn language without explicit structural supervision. What inductive biases make this learning possible? We address this fundamental cognitive question by leveraging transformer language models: we inject inductive bias into language models by pretraining on formally-structured data, and then evaluate the biased learners' ability to learn typologically-diverse natural languages. Our experimental setup creates a testbed for hypotheses about inductive bias in hu…

Cited by 2 publications

References 25 publications

Learning the Meanings of Function Words From Grounded Language Using a Visual Question Answering Model

Portelance, Frank, Jurafsky

2024

Cognitive Science

Self Cite

Interpreting a seemingly simple function word like “or,” “behind,” or “more” can require logical, numerical, and relational reasoning. How are such words learned by children? Prior acquisition theories have often relied on positing a foundation of innate knowledge. Yet recent neural‐network‐based visual question answering models apparently can learn to use function words as part of answering questions about complex visual scenes. In this paper, we study what these models learn about function words, in the hope of better understanding how the meanings of these words can be learned by both models and children. We show that recurrent models trained on visually grounded language learn gradient semantics for function words requiring spatial and numerical reasoning. Furthermore, we find that these models can learn the meanings of the logical connectives “and” and “or” without any prior knowledge of logical reasoning, and we find early evidence that they are sensitive to alternative expressions when interpreting language. Finally, we show that word learning difficulty is dependent on the frequency of models' input. Our findings offer proof‐of‐concept evidence that it is possible to learn the nuanced interpretations of function words in a visually grounded context by using non‐symbolic general statistical learning algorithms, without any prior knowledge of linguistic meaning.

Language Learning, Representation, and Processing in Humans and Machines: Introduction to the Special Issue

Apidianaki, Fourtassi, Padó

2024

Computational Linguistics

Large Language Models (LLMs) and humans acquire knowledge about language without direct supervision. LLMs do so by means of specific training objectives, while humans rely on sensory experience and social interaction. This parallelism has created a feeling in NLP and cognitive science that a systematic understanding of how LLMs acquire and use the encoded knowledge could provide useful insights for studying human cognition. Conversely, methods and findings from the field of cognitive science have occasionally inspired language model development. Yet, the differences in the way that language is processed by machines and humans (in terms of learning mechanisms, amounts of data used, grounding, and access to different modalities) make a direct translation of insights challenging. The aim of this edited volume has been to create a forum of exchange and debate along this line of research, inviting contributions that further elucidate similarities and differences between humans and LLMs.
