2021
DOI: 10.5281/zenodo.5297715

GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow


Cited by 134 publications (142 citation statements)
References 0 publications
“…The recent emergence of grassroots based open-sourcing initiatives can be attributed to an increasing adoption of the closed-source commercial API access mode of dissemination being used for projects such as GPT-3 [34], CLIP and DALL-E 13 . EleutherAI 14 achieved success by replicating both the WebText dataset (on which GPT-3 was trained) and the GPT-3 model itself by unveiling the Pile dataset [42] and the GPT-Neo [43]/GPT-NeoX [44] models. As indicated in the README section of the LAION Github repository 15 , the primal motivation behind the LAION-400M undertaking was to produce open-source variants of the opaque WIT (WebImageText) dataset, and the CLIP [2] and DALL-E [45] models.…”
Section: Motivational Drive: Open-sourcing the Closed-source
confidence: 99%
“…We used GitHub's code review tool 7 to manually classify errors in the code translations. Two authors examined each of the visual diffs produced within GitHub and made comments to label errors and explain the reason for why the code contained that error.…”
Section: Code Quality Measures
confidence: 99%
“…For cases in which participants omitted any implementation, we corrected their code by adding the two-method implementation, as it was closer in spirit to the original Java. 7 https://github.com/features/code-review/ 8 https://docs.github.com/en/rest/ 9 Source lines of code (SLOC) is a metric of the number of source lines of code; it does not include blank or commented lines. We used the cloc utility to compute SLOC for all code artifacts in our study, available at https://github.com/AlDanial/cloc.…”
Section: Code Quality Measures
confidence: 99%
“…Additionally, we conduct comparative experiments to verify whether open-source alternatives to GPT-3 could still provide comparable performance gains through data augmentation. As open-source alternatives, GPT-2 (Radford et al) and GPT-Neo (Black et al, 2021) were chosen. The latter is a popular alternative to the commercial GPT-3, performing competitively with the smaller versions (ada and babbage) of the counterpart.…”
Section: Language Model Capacity
confidence: 99%
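The statement above describes using GPT-Neo as an open-source substitute for GPT-3 when generating augmented training data. A minimal sketch of that kind of usage with the Hugging Face `transformers` library follows; the checkpoint name and sampling parameters are illustrative assumptions, not the cited paper's actual setup:

```python
# Hedged sketch: sampling continuations from GPT-Neo as candidate
# augmentation examples. Model choice and parameters are assumptions.
from transformers import pipeline

# EleutherAI/gpt-neo-125M is the smallest public GPT-Neo checkpoint;
# the 1.3B and 2.7B variants trade speed for generation quality.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")

def augment(seed_text, n=2, max_new_tokens=30):
    """Return n sampled continuations of seed_text as candidate examples."""
    outputs = generator(
        seed_text,
        do_sample=True,                # sample rather than decode greedily,
        max_new_tokens=max_new_tokens, # so augmentations are diverse
        num_return_sequences=n,
        pad_token_id=generator.tokenizer.eos_token_id,  # silence pad warning
    )
    # Each output dict holds the full text (prompt + sampled continuation).
    return [o["generated_text"] for o in outputs]

examples = augment("The service was quick and the staff")
```

In a data-augmentation pipeline such continuations would typically be filtered and labeled before being added to the training set; the quote's point is that this step can use an open checkpoint in place of a commercial API.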