2024
DOI: 10.1016/j.acpath.2023.100099

ChatGPT 3.5 fails to write appropriate multiple choice practice exam questions

Alexander Ngo, Saumya Gupta, Oliver Perrine, et al.

Cited by 10 publications (3 citation statements, all classified as mentioning)
References 12 publications
“…By specifying "in table format" in the prompt, information about certain topics can be summarized in a table (Figure 1). It can generate outlines, multiple-choice questions with answers and explanations [20], and even simulate conversations between people about a certain topic. There is a possibility that some of the information is incorrect due to the tendency of LLMs to hallucinate [19], so LLM-generated information must be verified with other sources.…”
Section: Education (mentioning; confidence: 99%)
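
The prompting pattern this statement describes can be illustrated with a short script. Below is a minimal sketch, assuming the OpenAI Python client (v1+) with an API key in the environment; the model name, topic, and prompt wording are illustrative assumptions, not the prompts used in the cited studies:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask for multiple-choice questions "in table format", as the cited
# statement suggests, with answers and explanations included.
prompt = (
    "Write 3 multiple-choice questions on complement activation for an "
    "undergraduate immunology course. Present them in table format with "
    "columns: question, choices A-D, correct answer, explanation."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

# LLMs can hallucinate, so the generated questions, answers, and
# explanations must be verified against other sources before use.
print(response.choices[0].message.content)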
“…Not all studies pertaining to ChatGPT and education were positive. Ngo et al. [20] used ChatGPT 3.5 to generate questions for an immunology course, but it was able to generate correct questions with answers and explanations in only 32% (19 of 60) of cases. In a separate study, ChatGPT-4 was tested with the 2022 American Society for Clinical Pathology resident question bank and it did not fare very well, scoring 60.42% in clinical pathology, 54.94% in anatomic pathology, and garnering an overall score of 56.98% [22].…”
Section: Education (mentioning; confidence: 99%)
“…[13][14][15][16][17] This approach allows for the generation of diverse and complex questions in seconds, offering flexibility and efficiency in item development. However, this AI-driven approach struggles with issues of inaccuracy and inconsistency [15], especially when good prompting strategies [18] are not employed [19]. In AI-driven item generation, such as with ChatGPT, these issues often emerge due to the model's reliance on its training data, which may not always align perfectly with the specific objectives intended by educators.…”
Section: Introduction (mentioning; confidence: 99%)