Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021) 2021
DOI: 10.18653/v1/2021.nlp4prog-1.5
CoTexT: Multi-task Learning with Code-Text Transformer

Abstract: We present CoTexT, a pre-trained, transformer-based encoder-decoder model that learns the representative context between natural language (NL) and programming language (PL). Using self-supervision, CoTexT is pre-trained on large programming language corpora to learn a general understanding of language and code. CoTexT supports downstream NL-PL tasks such as code summarizing/documentation, code generation, defect detection, and code debugging. We train CoTexT on different combinations of available PL corpus inclu…
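Since the abstract frames CoTexT as a T5-style encoder-decoder, a fine-tuned checkpoint can in principle be driven through a standard sequence-to-sequence interface. The sketch below shows one plausible way to query such a checkpoint for code summarization via Hugging Face Transformers; the model id "razent/cotext-1-ccg", the raw-code input format, and the generation settings are assumptions and may not match the released artifacts exactly.

```python
# Hedged sketch: querying a CoTexT-style T5 checkpoint for code summarization.
# The model id below is an assumption about the published checkpoint name;
# the expected task prefix / input format may also differ in practice.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "razent/cotext-1-ccg"  # assumed Hugging Face id for a CoTexT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```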

Cited by 70 publications (43 citation statements) · References 13 publications
“…PLBART [3] employed a denoising objective over code and natural language via token masking, token deletion, and token infilling as noising strategies. CoTexT [54] was built on top of T5 with a special focus on multi-task learning over multiple programming languages. Another model variant was introduced by Fried et al [27, InCoder].…”
Section: Pre-trained Language Models For Programming Language
Mentioning confidence: 99%
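For readers unfamiliar with the noising strategies named in this statement, the toy sketch below illustrates token masking, token deletion, and token infilling on a whitespace-tokenized code snippet. It is a simplified illustration only; it is not the actual PLBART or CoTexT preprocessing pipeline, and the mask symbol and corruption rates are arbitrary choices.

```python
# Simplified illustration of three denoising noising strategies:
# token masking, token deletion, and token infilling.
import random

MASK = "<mask>"

def token_masking(tokens, p=0.15):
    # Replace each token with a mask symbol with probability p.
    return [MASK if random.random() < p else t for t in tokens]

def token_deletion(tokens, p=0.15):
    # Drop tokens entirely; the model must infer which positions are missing.
    return [t for t in tokens if random.random() >= p]

def token_infilling(tokens, span_len=3):
    # Replace a contiguous span with a single mask token.
    if len(tokens) <= span_len:
        return [MASK]
    start = random.randrange(len(tokens) - span_len)
    return tokens[:start] + [MASK] + tokens[start + span_len:]

tokens = "def add ( a , b ) : return a + b".split()
print(token_masking(tokens))
print(token_deletion(tokens))
print(token_infilling(tokens))
```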
“…to the Biomedical [37,46,9], Legal [14], Cyber Security [2], and Finance [6] domains, while simultaneously expanding to include signals from modalities other than natural language, e.g. Vision [65,18,15,13,64,25], Proteins [11], Time Series [71,72,55] and Code [42,70,26,31,54].…”
Section: Introduction
Mentioning confidence: 99%
“…A short time ago, BARTpho, a large pretrained sequence-to-sequence model for Vietnamese inheriting the BART style (Lewis et al, 2019), demonstrated the effectiveness of pretrained language models on Vietnamese abstractive summarization. Nevertheless, some past works have shown that the T5 architecture (Raffel et al, 2019) might outperform BART in some aspects (i.e., Phan et al, 2021a). Inspired by that, we propose ViT5, trained on the Vietnamese monolingual subset of CC100, following the architecture and training methodology in Raffel et al (2019).…”
Section: Introduction
Mentioning confidence: 99%
“…Encouraged by the excellent recent results of data-driven APR approaches [15,4,30,21,5], in this work we wanted to investigate whether it is worth fine-tuning the GPT-2 model. At the time of writing this article, the top three approaches are CoTexT [22], PLBART [1] and DeepDebug [5]. Although none of these approaches use the GPT-2 model, their operating principle is similar.…”
Section: Introduction
Mentioning confidence: 99%
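As context for this statement, the sketch below outlines the general setup of fine-tuning GPT-2 on buggy/fixed code pairs with Hugging Face Transformers. The separator format, toy data, and hyperparameters are illustrative assumptions, not the quoted authors' actual recipe.

```python
# Hedged sketch of fine-tuning GPT-2 on buggy -> fixed code pairs, the general
# setup investigated in the quoted work. Data format, separators, and
# hyperparameters here are illustrative assumptions.
import torch
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Toy dataset: each example concatenates a buggy snippet and its fix.
pairs = [
    ("if (a = b) {", "if (a == b) {"),
    ("for (i = 0; i <= n; i++)", "for (i = 0; i < n; i++)"),
]
texts = [f"BUG: {bug} FIX: {fix}{tokenizer.eos_token}" for bug, fix in pairs]

class RepairDataset(torch.utils.data.Dataset):
    def __init__(self, texts):
        enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
        self.input_ids = enc["input_ids"]
        self.attention_mask = enc["attention_mask"]
        self.labels = self.input_ids.clone()
        self.labels[self.attention_mask == 0] = -100  # ignore padding in the loss

    def __len__(self):
        return self.input_ids.size(0)

    def __getitem__(self, i):
        return {"input_ids": self.input_ids[i],
                "attention_mask": self.attention_mask[i],
                "labels": self.labels[i]}

args = TrainingArguments(output_dir="gpt2-repair-sketch",
                         num_train_epochs=1,
                         per_device_train_batch_size=2,
                         logging_steps=1)
Trainer(model=model, args=args, train_dataset=RepairDataset(texts)).train()
```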