Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages 2019
DOI: 10.1145/3315508.3329976

A case study on machine learning for synthesizing benchmarks

Cited by 8 publications (5 citation statements)
References 28 publications

“…We analyze the dataset to explain this phenomenon and find CLgen generates a lot of comments, repeated dead statements and awkward nonhuman-like code such as multiple semi-colons. These results agree with the case study by Goens et al [14] that shows the AST depth distribution of CLgen's code is significantly narrower compared to code from GitHub or standard benchmarks.…”
Section: Analysis of BenchPress and CLgen Language Models
Citation type: supporting
confidence: 91%
“…Its synthetic benchmarks improve the accuracy of Grewe's et al predictive model [16] by 1.27×. However, Goens et al [14] perform a case study and show evidence that CLgen's synthetic benchmarks do not improve the quality of training data and, consequently, performance of predictive models. They show that a predictive model in fact performs worse with synthetic benchmarks as opposed to human written benchmarks or code from GitHub.…”
Section: Analysis of BenchPress and CLgen Language Models
Citation type: mentioning
confidence: 97%
“…The various design dimensions such as applied algorithm, architecture, or quantization span a large design space for cascaded classifiers. Adding the substantial variety of existing datasets and their complex feature space, benchmarking different solutions becomes a challenge by itself [7].…”
Section: Introduction
Citation type: mentioning
confidence: 99%