2023
DOI: 10.21203/rs.3.rs-2787476/v1
Preprint

A Test for Evaluating Performance in Human-AI Systems

Abstract: Many important uses of AI involve augmenting humans, not replacing them. But there is not yet a widely used and broadly comparable test for evaluating the performance of these human-AI systems relative to humans alone, AI alone, or other baselines. Here we describe such a test and demonstrate its use in three ways. First, in an analysis of 79 recently published results, we find that, surprisingly, the median performance improvement ratio corresponds to no improvement at all, and the maximum improvement is only…
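
The abstract is truncated and does not define the "performance improvement ratio" here. As a rough illustration only, the sketch below assumes the ratio is the human-AI system's score divided by a baseline score (higher scores being better), so a ratio of 1.0 means no improvement. The function name `improvement_ratio` and all scores are hypothetical and are not taken from the paper.

```python
from statistics import median


def improvement_ratio(system_score: float, baseline_score: float) -> float:
    """Assumed definition: human-AI system score divided by a baseline score."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return system_score / baseline_score


# Hypothetical (human-AI score, baseline score) pairs; NOT data from the paper.
hypothetical_results = [(0.80, 0.80), (0.90, 0.75), (0.60, 0.75)]

# One ratio per result: above 1.0 the system beats the baseline, below 1.0 it does worse.
ratios = [improvement_ratio(s, b) for s, b in hypothetical_results]

# A median ratio of 1.0 would correspond to "no improvement at all".
median_ratio = median(ratios)
print(f"median improvement ratio: {median_ratio:.2f}")
```

Under this assumed definition, the choice of baseline (humans alone, AI alone, or something else) changes the ratio, so any reported value only makes sense alongside the baseline it was computed against.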

Citations: Cited by 1 publication (5 citation statements)
References: References 37 publications

“…our last study, the tool did not show significant effects on performance. This finding is consistent with other work on LLMs [36,37] that also showed that improving performance synergistically can be difficult in various human-AI interaction scenarios [37]. Although GPT-3 produces impressively reasonable results in our studies, subjects remarked that the tool often made suggestions they already thought of.…”
Section: Discussion; citation type: supporting
confidence: 92%
“…However, when human creativity and judgment are required, AI systems will support humans and increase their productivity but are unlikely to replace them completely. Good examples for this are programming or writing assistants [24,1,36,37,22,3]. When there is no clear division of labor and full automation is not the aim human-AI interaction remains challenging [18,38], and guidelines are needed [39].…”
Section: Discussion; citation type: mentioning
confidence: 99%