2022
DOI: 10.48550/arxiv.2204.02311
Preprint

PaLM: Scaling Language Modeling with Pathways

Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pat…

Cited by 574 publications (779 citation statements)
References 89 publications
“…To achieve this, Flamingo takes inspiration from recent work in large-scale generative language models (LMs) which are good few-shot learners (Brown et al., 2020; Chowdhery et al., 2022; Hoffmann et al., 2022; Rae et al., 2021). A single large LM can indeed achieve strong performance on many tasks using only its text interface: a few examples of a task are provided to the model as a prompt, along with a query input, and the model generates a continuation to produce a predicted output for the task on that query.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
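The few-shot text interface described in this excerpt amounts to concatenating a handful of input-output demonstrations with a query and reading the model's continuation as the prediction. The sketch below illustrates that prompt construction; the sentiment-classification task, the "Q:/A:" format, and the example texts are assumptions chosen purely for illustration, not taken from PaLM or the citing papers.

```python
# Minimal sketch of few-shot prompting: demonstrations plus a query are joined
# into one prompt string, and the LM's continuation after the final "A:" is
# read off as the predicted output. Task and format are illustrative only.

def build_few_shot_prompt(examples, query, instruction=""):
    """Concatenate (input, output) demonstrations followed by the query input."""
    parts = [instruction] if instruction else []
    for inp, out in examples:
        parts.append(f"Q: {inp}\nA: {out}")
    parts.append(f"Q: {query}\nA:")  # the model is expected to continue after "A:"
    return "\n\n".join(parts)

if __name__ == "__main__":
    demos = [
        ("The movie was a delight from start to finish.", "positive"),
        ("I want those two hours of my life back.", "negative"),
    ]
    prompt = build_few_shot_prompt(
        demos,
        "A surprisingly warm and funny film.",
        instruction="Classify the sentiment of each review.",
    )
    print(prompt)
    # This string would then be sent to a large LM through its text interface,
    # and the generated continuation ("positive" / "negative") is the prediction.
```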
“…Large language models trained on vast repositories of code have demonstrated remarkable progress in neural program synthesis and related tasks [18,10,68,46,21]. However, such models generate code left-to-right, which makes them less directly applicable to many ubiquitous code editing tasks, such as fixing bugs, adding comments, or re-naming variables.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…We compare to results obtained from their API on an infilling task in Table 10 in the Appendix. While this setting is not directly comparable to the three-shot setting where the models of Austin et al. [10] and Chowdhery et al. [21] performed best, we found that our model did not benefit from additional examples in the prompt, which we attribute to the much smaller size of our model (6.7B, versus 137B or 540B parameters) and the sensitivity of in-context learning to model scale.…”
Citation type: mentioning
confidence: 99%
“…To this end, we use examples from IMPLICITRELATIONS in a few-shot in-context learning setting, where given several input-output examples and a test input, the LM is expected to generate the required output. We focus on this setup following the recent progress in in-context learning, specifically for tasks that involve general commonsense reasoning (Da et al., 2021; Chowdhery et al., 2022).…”
Section: Experimental Setting
Citation type: mentioning
confidence: 99%
“…Recent work (Smith et al., 2022; Chowdhery et al., 2022) has shown that the reasoning abilities of LMs improve with model size. We evaluate this effect on four models from the GPT-3 family: ada, babbage, curie, and davinci, which are assumed to have been trained using the same procedure, and are estimated to have 350M, 1.3B, 6.7B, and 175B parameters, respectively (Gao, 2021; Black et al., 2022).…”
Section: Effect Of Model Size
Citation type: mentioning
confidence: 99%