Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.170

If beam search is the answer, what was the question?

Abstract: Quite surprisingly, exact maximum a posteriori (MAP) decoding of neural language generators frequently leads to low-quality results (Stahlberg and Byrne, 2019). Rather, most state-of-the-art results on language generation tasks are attained using beam search despite its overwhelmingly high search error rate. This implies that the MAP objective alone does not express the properties we desire in text, which merits the question: if beam search is the answer, what was the question? We frame beam search as the exac…

Citations: cited by 73 publications (74 citation statements)
References: 35 publications
“…Note the tendency of LMs to assign unreasonably high probabilities to segments has also attracted attention from the viewpoint of memorization capability of LMs (Carlini et al, 2020). In addition, the connection of the UID hypothesis to the modern NLP techniques has been recently explored (Meister et al, 2020;Wei et al, 2021). We further investigate our hypothesis in Section 5.…”
Section: Discussion: Uniform Information Density (mentioning)
confidence: 99%
“…Specifically, the RL model improves BLEU scores on long sentences by 3+ BLEU points and BP on those sentences by about 9+ points. This shows that our model, via smart segmentation, suffers less because of premature truncation of long translations as compared to the baseline-a common problem (Meister et al, 2020;Koehn and Knowles, 2017). While segmentation of long sentences at appropriate punctuations helps performance, segmentation at all punctuations is expected to hurt performance as it is highly likely to produce extremely small segments which lose a lot of necessary source context when individually translated.…”
Section: Results (mentioning)
confidence: 96%
“…For tasks like MT, this is not the case: Eikema and Aziz (2020) pointed out that the argmax receives so little mass that it is almost arbitrary, so seeking it with MAP decoding (which beam search approximates) itself causes many deficiencies of decoding. On the other hand, Meister et al (2020a) showed that beam search has a helpful bias and introduced regularization penalties for MAP decoding that encode it explicitly. Entmax neither directly addresses the faults of MAP decoding nor compensates for the locality biases of beam search, instead shrinking the gap between beam search and exact decoding.…”
Section: Related Work (mentioning)
confidence: 99%