27th International Conference on Intelligent User Interfaces 2022
DOI: 10.1145/3490099.3511157
Better Together? An Evaluation of AI-Supported Code Translation

Abstract: Generative machine learning models have recently been applied to source code, for use cases including translating code between programming languages, creating documentation from code, and auto-completing methods. Yet, state-of-the-art models often produce code that is erroneous or incomplete. In a controlled study with 32 software engineers, we examined whether such imperfect outputs are helpful in the context of Java-to-Python code translation. When aided by the outputs of a code translation model, participan…


Cited by 38 publications (13 citation statements)
References 80 publications
“…Empirical evaluations of this model have shown that, although the quality of its outputs is quite good, those outputs may still be problematic [57]. Echoing the results from Weisz et al. [103], human-centered evaluations of Copilot have found that it increases users' feelings of productivity [109], and that almost a third (27%) of its proposed code completions were accepted by users. In a contrasting evaluation, Vaithilingam et al. [95] found that while most participants expressed a preference to use Copilot in their daily work, it did not necessarily improve their task completion times or success rates.…”
Section: Code-fluent Foundation Models and Human-centered Evaluations…
confidence: 94%
“…code that is free of syntax or logical errors). Nonetheless, Weisz et al. [102] found that software engineers are still interested in using such models in their work, and that the imperfect outputs of these models can even help them produce higher-quality code via human-AI collaboration [103].…”
Section: Code-fluent Foundation Models and Human-centered Evaluations…
confidence: 99%
“…Factors observed in studies evaluating UX [21] include appearance, perceptions, performance, availability, and overall satisfaction. Additionally, cognitive load [22] and efficacy [23] have also been observed as factors in intelligent user interface evaluation.…”
Section: Background and Related Work
confidence: 99%