2023
DOI: 10.7759/cureus.50629
ChatGPT Performance in Diagnostic Clinical Microbiology Laboratory-Oriented Case Scenarios

Malik Sallam,
Khaled Al-Salahat,
Eyad Al-Ajlouni

Abstract: Background: Artificial intelligence (AI)-based tools can reshape healthcare practice. These include ChatGPT, which is among the most popular AI-based conversational models. Nevertheless, the performance of different versions of ChatGPT requires further evaluation in diverse settings to assess its reliability and credibility in various healthcare-related tasks. Therefore, the current study aimed to assess the performance of the freely available ChatGPT-3.5 and the paid version ChatGPT-4 in 10 different…

Cited by 12 publications (8 citation statements)
References 47 publications
“…An early systematic review by Sallam emphasized the relatively below-par performance of ChatGPT in some topics, hindering its current utility in healthcare education [1]. Similarly, multiple later studies confirmed this concern of generating inaccurate content in specific topics (e.g., Radiology, Microbiology) [77][78][79].…”
Section: Discussion (mentioning)
confidence: 83%
“…The real-world application of the METRICS checklist has been valuable in identifying potential research limitations and in enhancing the overall structure and clarity of the reporting of results. These studies also demonstrate the value of the METRICS checklist for guiding researchers toward more rigorous design and transparent reporting of generative AI-based studies in health care [99,100].…”
Citation type: mentioning
confidence: 75%
“…The second study investigating the performance of both ChatGPT-3.5 and ChatGPT-4 in the context of diagnostic microbiology case scenarios was conceived based on the METRICS checklist [100]. The prospective incorporation of the METRICS checklist was particularly instrumental in refining the study design and the reporting of results [100].…”
Citation type: mentioning
confidence: 99%
“…Lastly, Bard’s agreement was κ=0.903 for Completeness, κ=1 for Accuracy, and κ=0.693 for Relevance. Finally, the overall modified CLEAR (mCLEAR) scores for AI content quality were averaged based on the scores of the two raters and categorized as follows: “Poor” (1–1.79), “Below average” (1.80–2.59), “Average” (2.60–3.39), “Above average” (3.40–4.19), and “Excellent” (4.20–5.00), similar to the previous approach in [52].…”
Section: Methods (mentioning)
confidence: 99%