Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.728
PyMT5: multi-mode translation of natural language and Python code with transformers

Abstract: Simultaneously modeling source code and natural language has many exciting applications in automated software development and understanding. Pursuant to achieving such technology, we introduce PYMT5, the PYTHON method text-to-text transfer transformer, which is trained to translate between all pairs of PYTHON method feature combinations: a single model that can both predict whole methods from natural language documentation strings (docstrings) and summarize code into docstrings of any common style. We present …

Cited by 88 publications (74 citation statements). References 20 publications.
“…The ROUGE-L metrics are dramatically improved, which is not necessarily surprising, as XPyMT5 is conditioned on much more information than PyMT5. The syntax correctness of our fine-tuned models is slightly lower than the 92.1% reported by Clement et al. (2020).…”
Section: Method Completion Evaluation Results (contrasting)
confidence: 68%
“…eWASH yields N total training samples from a file with N total methods and class methods. For docstring completion or code summarization, the source contains the method signature and body, the target contains the desired docstring, and a control code is used to instruct the model which task it is to perform, just like PyMT5 (Clement et al., 2020).…”
Section: Extended Window Access by Syntax Hierarchy (mentioning)
confidence: 99%
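The statement above describes the control-code scheme PyMT5-style models use: a task token prefixed to the source sequence tells a single seq2seq model which translation direction to perform. A minimal sketch of building such a training pair for code summarization — the token `<to_docstring>` and the exact formatting are illustrative assumptions, not the paper's actual vocabulary:

```python
# Hypothetical sketch of control-code sample construction for a
# multi-task code/docstring seq2seq model. The control token
# "<to_docstring>" and the concatenation format are assumptions.

def make_docstring_sample(signature: str, body: str, docstring: str):
    """Build a (source, target) pair for seq2seq training.

    The control code prefixed to the source instructs the model
    which task to perform (here: code -> docstring).
    """
    source = "<to_docstring> " + signature + "\n" + body
    target = docstring
    return source, target


src, tgt = make_docstring_sample(
    "def add(a, b):",
    "    return a + b",
    "Return the sum of a and b.",
)
```

The same model can then be trained on the reverse direction (docstring to method body) simply by swapping the fields and using a different control token, which is what makes one model cover all feature-pair translations.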
“…Code Summarization (Movshovitz-Attias and Cohen, 2013; Allamanis et al., 2016; Iyer et al., 2016; Alon et al., 2019a; Hu et al., 2018; Harer et al., 2019; Ahmad et al., 2020), Bug Detection (Ray et al., 2016; Li et al., 2018b; Russell et al., 2018), Program Repair (Chen et al., 2019; Lutellier et al., 2020), Code Translation (Chen et al., 2018; Drissi et al., 2018; Xu et al., 2020), Clone Detection (Zhang et al., 2019; Yu et al., 2019), and Code Completion (Li et al., 2018a; Hellendoorn and Devanbu, 2017; Parvez et al., 2018) are some of the tasks that are addressed with deep neural solutions. While most of the prior approaches use task-specific representation learning, a few works (Alon et al., 2019b; Feng et al., 2020; Lachaux et al., 2020; Clement et al., 2020) attempted to learn transferable representations in an unsupervised fashion. Most closely related to our work, CodeBERT (Feng et al., 2020) is pre-trained on bimodal data to capture the semantic interaction between the input modalities (i.e., programming and natural languages).…”
Section: Deep Learning in Software Engineering (mentioning)
confidence: 99%