2020
DOI: 10.48550/arxiv.2009.08603
Preprint

Towards Full-line Code Completion with Neural Language Models

Cited by 7 publications (12 citation statements)
References 17 publications
“…We use the ETH PY150 python dataset (the standard code completion benchmark) provided by Raychev et al [13] to ensure a fair comparison with prior studies [6], [7], [9], [34]. The dataset is collected from open-source software projects in GitHub repositories with non-viral licenses (e.g.…”
Section: Dataset (mentioning)
confidence: 99%
“…Threats to external validity relate to the degree to which our approach can be generalized across other context. We evaluate our PyCoder with 50,000 python files from PY150 dataset which is the dataset used in many literature [3], [6], [7], [9], [11], [12], [34]. We also evaluate the model with the code completion benchmark in CodeXGLUE [3].…”
Section: Threats To Validity (mentioning)
confidence: 99%
“…Their setting differs from ours in assuming access to a candidate provider of reasonable quality upfront. On the other hand, Svyatkovskiy et al [67] and Wang et al [74] respectively study multilingual and whole line completion using Transformers. Our work is, in every respect, orthogonal to the two aforementioned, as the idea of leveraging fine-tuned relations for better completion is applicable to both settings. Lastly, we comment that ML4Code researchers have borrowed successful ideas in the NLP community such as pretraining large Transformers on heterogeneous datasets for transfer learning and multi-task learning [24,26,23,35].…”
Section: Related Work (mentioning)
confidence: 99%
“…While early research focused mostly on narrow API-level completion [5,15,32], modern language models based on neural networks vary from fine-grained, using every possible lexical token type (delimiters, operators, white spaces, keywords, etc.) [14], to coarse-grained, predicting entire lines of code [8,45].…”
Section: Introduction (mentioning)
confidence: 99%
“…In practice, over time, effective token-level code completion can save the users a lot of effort. However, our approach is easy to extend to other types of completion, and we leave applying the usage of logs for the full-line version of code completion [45] for subsequent work.…”
Section: Introduction (mentioning)
confidence: 99%