2024
DOI: 10.58496/mjaih/2024/001

Evaluating ChatGPT performance in Arabic dialects: A comparative study showing defects in responding to Jordanian and Tunisian general health prompts

Malik Sallam,
Dhia Mousa

Abstract: Background: The role of artificial intelligence (AI) in enhancing digital health literacy is increasingly recognized. This is of particular importance given the widespread availability and popularity of AI chatbots such as ChatGPT and their possible impact on health literacy. This involves the need to understand AI models’ performance across different languages, dialects, and cultural contexts. This study aimed to evaluate ChatGPT performance in response to prompting in two different Arabic dialects, namely Tunisian an…

citations
Cited by 13 publications
(6 citation statements)
references
References 26 publications
“…These include the format of the exam and its time limits and the specific cohort of students. Finally, the study results was based on prompting the AI-based models in English, which may also limit the generalizability of results based on varying levels of performance of AI models based on languages used [71,72].…”
Section: Discussion (mentioning)
confidence: 99%
“…Evaluating the appropriateness of state regulations for changes in technology use is replicable for dynamic systems beyond health and given a literature review where appropriate for specific risks. In particular, research that measures business AI adoption trends and existing regulatory accounting mechanisms can provide governance visibility globally in areas such as financial development or communications facing modernization [23][24][25]…”
Section: Methods (mentioning)
confidence: 99%
“…Therefore, the current study aimed to compare the performance of two prominent AI models (ChatGPT-4 versus Gemini) in English and Arabic languages within the specialized field of Virology. The original hypothesis postulated that generative AI models' performance in English is superior to that in Arabic, inferred based on the presumed higher quality of training data available in English and based on the few reports describing this disparity in language performance [35,36]. Highlighting the critical discrepancies in generative AI performance across languages can lead to identification of possible areas for improvement by AI developers.…”
Section: Introduction (mentioning)
confidence: 99%