2023
DOI: 10.3390/fi16010004

Development of an Assessment Scale for Measurement of Usability and User Experience Characteristics of Bing Chat Conversational AI

Goran Bubaš,
Antonela Čižmešija,
Andreja Kovačić

Abstract: After the introduction of the ChatGPT conversational artificial intelligence (CAI) tool in November 2022, there has been a rapidly growing interest in the use of such tools in higher education. While the educational uses of some other information technology (IT) tools (including collaboration and communication tools, learning management systems, chatbots, and videoconferencing tools) have been frequently evaluated regarding technology acceptance and usability attributes of those technologies, similar evaluatio…

citations
Cited by 11 publications
(5 citation statements)
references
References 57 publications
“…This study contributes to the growing body of quantitative research whose aim is to evaluate the UX/Usability/Emotions of GenAI tools [29,43,44], specifically GenAI image tools in the design domain [31,45]. This study can assess that our design students consider these platforms to have slightly above-the-average positive Usability levels, with insufficient UX scores, even more so when compared to other products.…”
Section: Discussion (mentioning)
confidence: 99%
“…Some authors underline that these interfaces also have Usability problems that require the user to understand how to write prompts, be capable of writing prose text, and say what he/she wants instead of indicating to the computer how to accomplish the desired result [3]. Because of these specificities, some authors [29] have tried to develop a set of Usability and UX assessment scales for an in-depth evaluation of potentially essential characteristics of platforms like ChatGPT, Bing Chat, and Bard. Recognizing that these conversational interfaces have unique design requirements leads some researchers to investigate the fundamental UX design principles of conversational interface design [30].…”
Section: Literature Review (mentioning)
confidence: 99%
“…This study advances the theoretical foundations of usability testing by standardizing the evaluation process across different methodologies and scenarios. It offers a structured approach that addresses inconsistencies in current practices and contributes to theoretical advancements in computing [63][64][65][66][67], especially when facing instruments with diverse numbers of items per construct [69,70]. Practically, this method facilitates the adoption of mixed-method approaches, expanding the applicability and relevance of heuristic evaluations in the evolving landscape of human-computer interaction [56].…”
Section: Discussion (mentioning)
confidence: 99%
“…Face validity indicates the extent to which a test appears effective in terms of its stated aims [68]. Mathematically, aligning the number of items per construct enhances reliability [69] and reflects principles from item-response theory, emphasizing the significance of each question's contribution to the overall construct [70]. Some research does not follow a uniform scale, and experts have designed proper weight systems to accomplish that task [71][72][73].…”
mentioning
confidence: 99%
“…Face validity indicates the extent to which a test appears effective in terms of its stated aims [64]. Mathematically, aligning the number of items per construct enhances reliability [65] and reflects principles from item-response theory, emphasizing the significance of each question's contribution to the overall construct [66].…”
Section: Evaluators Biases (mentioning)
confidence: 99%