2014
DOI: 10.1145/2666356.2594321

Code completion with statistical language models

Abstract: We address the problem of synthesizing code completions for programs using APIs. Given a program with holes, we synthesize completions for holes with the most likely sequences of method calls. Our main idea is to reduce the problem of code completion to a natural-language processing problem of predicting probabilities of sentences. We design a simple and scalable static analysis that extracts sequences of method calls from a large codebase, and indexes these into a statistical language model. We then employ the language model to find the highest ranked sentences, and use them to synthesize a code completion.
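To make the reduction concrete, here is a minimal sketch of the idea: method-call sequences mined from a codebase are treated as sentences, an n-gram language model is trained over them, and candidate completions for a hole are ranked by their sentence probability. The trigram model, add-one smoothing, and the toy corpus below are illustrative assumptions, not the paper's actual static analysis or language model.

```python
# Toy trigram language model over method-call "sentences".
from collections import defaultdict

class TrigramModel:
    def __init__(self):
        self.trigrams = defaultdict(int)  # counts of (w1, w2, w3)
        self.bigrams = defaultdict(int)   # counts of (w1, w2)
        self.vocab = set()

    def train(self, sequences):
        # Each sequence is a list of API calls extracted from one method body.
        for seq in sequences:
            padded = ["<s>", "<s>"] + seq + ["</s>"]
            for a, b, c in zip(padded, padded[1:], padded[2:]):
                self.trigrams[(a, b, c)] += 1
                self.bigrams[(a, b)] += 1
                self.vocab.add(c)

    def prob(self, a, b, c):
        # Add-one smoothing keeps unseen call trigrams at nonzero probability.
        return (self.trigrams[(a, b, c)] + 1) / (self.bigrams[(a, b)] + len(self.vocab))

    def score(self, seq):
        # Probability of a whole call sequence; higher means more idiomatic.
        padded = ["<s>", "<s>"] + seq + ["</s>"]
        p = 1.0
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            p *= self.prob(a, b, c)
        return p

# Hypothetical corpus: call sequences mined from a codebase by static analysis.
corpus = [
    ["FileReader.<init>", "BufferedReader.<init>",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.<init>", "BufferedReader.<init>", "BufferedReader.readLine",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileWriter.<init>", "BufferedWriter.<init>",
     "BufferedWriter.write", "BufferedWriter.close"],
]
model = TrigramModel()
model.train(corpus)

# Rank two candidate completions for a hole after BufferedReader.<init>.
cand1 = ["FileReader.<init>", "BufferedReader.<init>",
         "BufferedReader.readLine", "BufferedReader.close"]
cand2 = ["FileReader.<init>", "BufferedReader.<init>",
         "BufferedWriter.write", "BufferedReader.close"]
print(model.score(cand1) > model.score(cand2))  # True: idiomatic sequence wins
```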

Cited by 289 publications (301 citation statements)
References 30 publications
“…White et al. (2015) trained RNNs on source code and showed their practicality in code completion. Similarly, Raychev et al. (2014) used RNNs in code completion to synthesize method call chains in Java code.…”
Section: Prior Work
confidence: 99%
“…The mental model of the programmer may be something like a language model for speech, but applied to code. Language models are typically applied to natural human utterances, but they have also been successfully applied to software (Hindle et al., 2012; Raychev et al., 2014; White et al., 2015), and can be used to discover unexpected segments of tokens in source code (Campbell et al., 2014).…”
Section: Introduction
confidence: 99%
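The "unexpected segments of tokens" use mentioned above (Campbell et al., 2014) follows from the same machinery: score each token by its surprisal under a language model and flag tokens the model finds improbable. A short sketch, reusing the toy TrigramModel from the earlier snippet; the threshold is a made-up value for that toy corpus, not one from the cited work.

```python
import math

def surprisal(model, seq):
    # Per-token surprisal in bits: -log2 P(token | two preceding tokens).
    padded = ["<s>", "<s>"] + seq + ["</s>"]
    return [(c, -math.log2(model.prob(a, b, c)))
            for a, b, c in zip(padded, padded[1:], padded[2:])]

tokens = ["FileReader.<init>", "BufferedReader.<init>", "BufferedWriter.write"]
for tok, bits in surprisal(model, tokens):
    if bits > 3.3:  # hypothetical threshold tuned to the toy corpus above
        print(f"unexpected: {tok} ({bits:.1f} bits)")
```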
“…For this task, we employed long short-term memory (LSTM) recurrent neural networks, as they were successfully used by prior work in predicting tokens from source code (Raychev et al., 2014; White et al., 2015). Unlike the prior work, we trained two models: a forwards model, which given a prefix context returns the distribution of the next token; and a backwards model, which given a suffix context returns the distribution of the previous token.…”
Section: Training the LSTMs
confidence: 99%
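The forwards/backwards pairing described above can be sketched with two ordinary LSTM token models, one fed the prefix as-is and one fed the suffix reversed. The snippet below is an illustrative PyTorch sketch under assumed dimensions and token ids; it is not the cited authors' architecture or training setup, and the models are untrained here.

```python
import torch
import torch.nn as nn

class TokenLSTM(nn.Module):
    """LSTM language model that predicts the token following its input context."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) ids; return logits for the next token.
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h[:, -1, :])

vocab_size = 1000             # assumed toy vocabulary
fwd = TokenLSTM(vocab_size)   # trained on prefixes: predicts the next token
bwd = TokenLSTM(vocab_size)   # trained on reversed suffixes: predicts the previous token

prefix = torch.tensor([[5, 42, 7]])   # token ids before the hole (made up)
suffix = torch.tensor([[9, 13, 2]])   # token ids after the hole (made up)

next_dist = fwd(prefix).softmax(dim=-1)                  # P(next token | prefix)
prev_dist = bwd(suffix.flip(dims=[1])).softmax(dim=-1)   # P(previous token | suffix)

# Multiplying the two distributions scores candidates for the hole from both sides.
combined = next_dist * prev_dist
print(combined.argmax(dim=-1))  # id of the highest-scoring candidate token
```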
“…NLP techniques such as n-gram and recurrent neural network models were used to synthesize sequences of calls to some APIs, together with their arguments [10]. The combination of NLP and statistical reasoning may be used for other tasks, such as the automatic creation of program test cases, default implementations of methods and functions, automatic classification of program behaviors using latent semantic analysis [6], and sophisticated code completion from program structures and source comments.…”
confidence: 99%