Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.375

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

Abstract: Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models' syntactic nominal number and verbal argument structure generalizations fo…

Cited by 6 publications (1 citation statement)
References 34 publications
“…The fact that tokens seen only a few times are generally expected to be able to take direct objects suggests a transitivity learning bias in the model. Such a bias would align with recent work assessing few-shot learning of syntactic categories, specifically Jumelet et al (2019), who hypothesize that models learn a default category for number and gender, and Wilcox et al (2020), who provide data from few-shot learning tests that are consistent with the hypotheses in Jumelet et al (2019). Interestingly, the results from Wilcox et al (2020) also suggest that the models tested learn a default transitive category for verbs, although they test Recurrent Neural Network models, not transformers, so more careful cross-model comparisons are needed.…”
Section: Psycholinguistic Assessment Results (supporting)
confidence: 52%