2023
DOI: 10.48550/arxiv.2301.09043
Preprint

CodeScore: Evaluating Code Generation by Learning Code Execution

Abstract: A proper code evaluation metric (CEM) profoundly impacts the evolution of code generation, an important research field in NLP and software engineering. Prevailing CEMs fall into two categories: match-based CEMs (e.g., BLEU, Accuracy, and Code-BLEU) and execution-based CEMs (e.g., Avg-PassRatio and Pass@k). Both have drawbacks. The former measure only differences in surface form, regardless of whether the code is functionally equivalent, while the latter incur huge execution overheads, includi…
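The contrast the abstract draws between the two metric families can be illustrated with a small sketch. This is not the paper's implementation: `token_overlap` is a hypothetical stand-in for a match-based score (a simple Jaccard overlap, far cruder than BLEU), `pass_ratio` is an assumed per-sample form of the quantity a metric like Avg-PassRatio would average, and `pass_at_k` is the commonly used unbiased estimator 1 - C(n-c, k) / C(n, k), which may differ from the exact definition used in the paper.

```python
import math


def token_overlap(candidate: str, reference: str) -> float:
    """Hypothetical surface-form score: Jaccard overlap of whitespace tokens."""
    a, b = set(candidate.split()), set(reference.split())
    return len(a & b) / len(a | b) if (a | b) else 1.0


def pass_ratio(source: str, func_name: str, tests) -> float:
    """Fraction of (args, expected) test cases the generated function passes."""
    namespace = {}
    exec(source, namespace)  # executes arbitrary code; sandbox this in practice
    func = namespace[func_name]
    passed = 0
    for args, expected in tests:
        try:
            passed += func(*args) == expected
        except Exception:
            pass  # a crash counts as a failed test case
    return passed / len(tests)


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c are correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


# Two functionally equivalent programs with very different surface forms.
reference = "def add(a, b):\n    return a + b"
candidate = "def add(x, y):\n    total = x\n    total += y\n    return total"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]

print(token_overlap(candidate, reference))  # low, despite equivalent behavior
print(pass_ratio(candidate, "add", tests))  # all test cases pass
print(pass_at_k(n=10, c=3, k=1))
```

The match-based score penalizes the candidate for renamed variables and an extra statement, while the execution-based score needs test cases and a safe runtime for every sample, which is the overhead the abstract refers to.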

Cited by: 2 publications
References: 13 publications