2021
DOI: 10.48550/arxiv.2109.04867
Preprint

Studying word order through iterative shuffling

Abstract: As neural language models approach human performance on NLP benchmark tasks, their advances are widely seen as evidence of an increasingly complex understanding of syntax. This view rests upon a hypothesis that has not yet been empirically tested: that word order encodes meaning essential to performing these tasks. We refute this hypothesis in many cases: in the GLUE suite and in various genres of English text, the words in a sentence or phrase can rarely be permuted to form a phrase carrying substantially dif…

Cited by 1 publication (1 citation statement)
References: 17 publications
“…Procedures that force LMs to be more focused on a prompt, or a specific part of it, when generating or ranking tokens can benefit algorithms that search for combinations of words through sampling. It would be interesting to use coherence boosting models in non-autoregressive text generation algorithms, such as to accelerate the mixing of MCMC methods for constrained text generation (e.g., Miao et al (2019); Zhang et al (2020a); Malkin et al (2021)). (Holtzman et al, 2021) is an unconditional probability normalization method, CC (Zhao et al, 2021) is the contextual calibration method and Channel (Min et al, 2021) uses an inverted-LM scoring approach that computes the conditional probability of the input given the label.…”
Section: Discussion
Citation type: mentioning (confidence: 99%)
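
The quoted passage contrasts several LM scoring strategies, including the channel ("inverted-LM") approach of Min et al. (2021), which scores the conditional probability of the input given the label rather than the label given the input. As a rough illustration only (not the cited authors' code), the sketch below contrasts direct scoring with channel scoring; the helper name `sequence_logprob`, the toy scorer, and the label verbalizers are all assumptions introduced for this example.

```python
# Minimal sketch contrasting direct LM scoring with channel ("inverted-LM") scoring.
# `LogProbFn` stands in for a hypothetical helper that returns the total log-probability
# an autoregressive LM assigns to `text` when `context` is fed as a prefix.
from typing import Callable, Dict

LogProbFn = Callable[[str, str], float]  # (text, context) -> log p(text | context)


def direct_score(logprob: LogProbFn, x: str, labels: Dict[str, str]) -> str:
    # Direct scoring: pick the label whose verbalizer is most probable given the input,
    # i.e. argmax_y log p(verbalizer(y) | x).
    return max(labels, key=lambda y: logprob(labels[y], x))


def channel_score(logprob: LogProbFn, x: str, labels: Dict[str, str]) -> str:
    # Channel / inverted-LM scoring: pick the label under which the *input* is most
    # probable, i.e. argmax_y log p(x | verbalizer(y)), assuming a uniform label prior.
    return max(labels, key=lambda y: logprob(x, labels[y]))


if __name__ == "__main__":
    # Toy stand-in for an LM scorer so the sketch runs end to end; a real implementation
    # would sum token log-probs from a language model instead of counting word overlap.
    def toy_logprob(text: str, context: str) -> float:
        return float(len(set(text.lower().split()) & set(context.lower().split())))

    labels = {"positive": "This review is positive.", "negative": "This review is negative."}
    x = "A positive, thoroughly enjoyable film."
    print(direct_score(toy_logprob, x, labels))
    print(channel_score(toy_logprob, x, labels))
```

In the channel direction, the label verbalizer is treated as the conditioning context and the input is the scored text, which is what the quoted description of Channel scoring refers to.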