2023
DOI: 10.3390/electronics12143095

A Testing Framework for AI Linguistic Systems (testFAILS)

Yulia Kumar,
Patricia Morreale,
Peter Sorial
et al.

Abstract: This paper presents an innovative testing framework, testFAILS, designed for the rigorous evaluation of AI Linguistic Systems (AILS), with particular emphasis on the various iterations of ChatGPT. Leveraging orthogonal array coverage, this framework provides a robust mechanism for assessing AI systems, addressing the critical question, “How should AI be evaluated?” While the Turing test has traditionally been the benchmark for AI evaluation, it is argued that current, publicly available chatbots, despite their…
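As a rough illustration of the orthogonal-array coverage idea the abstract refers to, the sketch below builds a pairwise (2-way) covering test suite with a greedy heuristic. The factors and levels shown are hypothetical placeholders, not taken from the paper, and the greedy selection is only one simple way to approximate an orthogonal array:

```python
from itertools import combinations, product

# Hypothetical dimensions for probing a chatbot; the paper's actual
# factors and levels are not reproduced here.
factors = {
    "language": ["English", "Spanish", "Ukrainian"],
    "task": ["code generation", "translation", "Q&A"],
    "prompt_length": ["short", "long"],
}

def all_pairs(factors):
    """Every (factor, level) pair across each two distinct factors."""
    names = list(factors)
    pairs = set()
    for a, b in combinations(names, 2):
        for va, vb in product(factors[a], factors[b]):
            pairs.add(((a, va), (b, vb)))
    return pairs

def pairwise_suite(factors):
    """Greedily keep full-factorial rows until every 2-way pair is covered."""
    names = list(factors)
    remaining = all_pairs(factors)
    suite = []
    for row in product(*factors.values()):
        test = dict(zip(names, row))
        covered = {((a, test[a]), (b, test[b]))
                   for a, b in combinations(names, 2)}
        if covered & remaining:      # row contributes a new pair
            suite.append(test)
            remaining -= covered
        if not remaining:
            break
    return suite

suite = pairwise_suite(factors)
print(f"{len(suite)} tests cover all pairs; "
      f"full factorial needs {len(list(product(*factors.values())))}")
```

The point of the technique is that every pair of factor levels is exercised by at least one test while the suite stays far smaller than the full factorial, which is what makes systematic coverage of a large prompt space tractable.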


Cited by 12 publications (7 citation statements)
References 36 publications
“…Future research will extend beyond natural language processing, focusing on the resilience of state-of-the-art LLMs against multimodal adversarial attacks involving both text and images. The objectives will include evaluating the vulnerability of LLMs to combined text and image-based adversarial attacks and proposing novel strategies to enhance the resilience of multimodal AI systems [36][37][38]. This direction aligns with the evolving landscape of AI, where understanding and countering sophisticated adversarial tactics become increasingly vital.…”
Section: Discussion
confidence: 99%
“…This new direction aims to assess the vulnerability of LLMs to more sophisticated and composite adversarial strategies, thereby contributing to the development of more robust and resilient AI systems. The objectives will include evaluating the vulnerability of LLMs to combined text- and image-based adversarial attacks and proposing novel strategies to enhance the resilience of multimodal AI systems [39][40][41].…”
Section: Future Research
confidence: 99%
“…In the context of scientific coding, Cory Merow et al [10] found that AI chatbots could boost scientific coding by assisting scientists in writing code more quickly and accurately. Yulia Kumar et al [11] proposed a comprehensive testing framework for AI linguistic systems using ChatGPT (version 4) as an AI pair programmer. They found that ChatGPT could generate test cases more effectively than human experts, indicating its potential in the M-to-PY conversion for complex projects such as image skeletonization.…”
Section: Related Work
confidence: 99%