2023
DOI: 10.1101/2023.03.24.23287731
Preprint

Performance of ChatGPT on free-response, clinical reasoning exams

Abstract: Importance: Studies show that ChatGPT, a general-purpose large language model chatbot, could pass the multiple-choice US Medical Licensing Exams, but the model's performance on open-ended clinical reasoning is unknown. Objective: To determine if ChatGPT is capable of consistently meeting the passing threshold on free-response, case-based clinical reasoning assessments. Design: Fourteen multi-part cases were selected from clinical reasoning exams administered to pre-clerkship medical students between 2019 and 2…

Cited by 35 publications (8 citation statements) | References 2 publications
“…This could lead to faster and more accurate diagnoses, improving patient outcomes. [3][4][5] Second, ChatGPT can help synthesize and analyze vast amounts of medical literature, which could lead to the discovery of new treatments, medications, or a better understanding of diseases. The potential of ChatGPT in medical writing is currently the most studied and discussed in the literature.…”
mentioning
confidence: 99%
“…This indicates a significant degree of variability in ChatGPT's responses, even when faced with identical scenarios. 42 Another study aimed to evaluate ChatGPT's capacity for ongoing clinical decision support. The research involved inputting published clinical vignettes into ChatGPT-3.5 and assessing its accuracy in various areas such as differential diagnoses, diagnostic testing, final diagnosis, and management.…”
Section: Discussion
mentioning
confidence: 99%
“…Yet, any evaluation of these tools must be context-specific and rigorous. This assessment is particularly relevant, urgent, and novel for complex conditions, such as ODS, where optimal multidisciplinary diagnostic and therapeutic paradigms are challenging to establish due to a relatively scarce body of recently published evidence [ 11 ]. This study aimed to evaluate the reliability of two versions of the ChatGPT LLM in managing ODS.…”
Section: Discussion
mentioning
confidence: 99%