2024
DOI: 10.1016/j.acpath.2023.100099

ChatGPT 3.5 fails to write appropriate multiple choice practice exam questions

Alexander Ngo, Saumya Gupta, Oliver Perrine, et al.

Cited by 10 publications (3 citation statements, all classified as mentioning)
References 12 publications
“…By specifying "in table format" in the prompt, information about certain topics can be summarized in a table (Figure 1). It can generate outlines, multiple-choice questions with answers and explanations [20], and even simulate conversations between people about a certain topic. There is a possibility that some of the information is incorrect due to the tendency of LLMs to hallucinate [19], so LLM-generated information must be verified with other sources.…”
Section: Education (mentioning; confidence: 99%)
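
The prompting pattern this statement describes can be illustrated with a short script. Below is a minimal sketch, assuming the OpenAI Python client (v1+) with an API key in the environment; the model name, topic, and prompt wording are illustrative assumptions, not the prompts used in the cited studies:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask for multiple-choice questions "in table format", as the cited
# statement suggests, with answers and explanations included.
prompt = (
    "Write 3 multiple-choice questions on complement activation for an "
    "undergraduate immunology course. Present them in table format with "
    "columns: question, choices A-D, correct answer, explanation."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

# LLMs can hallucinate, so the generated questions, answers, and
# explanations must be verified against other sources before use.
print(response.choices[0].message.content)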
“…Not all studies pertaining to ChatGPT and education were positive. Ngo et al. [20] used ChatGPT 3.5 to generate questions for an immunology course, but it was able to generate correct questions with answers and explanations in only 32% (19 of 60) of cases. In a separate study, ChatGPT-4 was tested with the 2022 American Society for Clinical Pathology resident question bank and it did not fare very well, scoring 60.42% in clinical pathology, 54.94% in anatomic pathology, and garnering an overall score of 56.98% [22].…”
Section: Education (mentioning; confidence: 99%)
“…[13][14][15][16][17] This approach allows for the generation of diverse and complex questions in seconds, offering flexibility and efficiency in item development. However, this AI-driven approach struggles with issues of inaccuracy and inconsistency [15], especially when good prompting strategies [18] are not employed [19]. In AI-driven item generation, such as with ChatGPT, these issues often emerge due to the model's reliance on its training data, which may not always align perfectly with the specific objectives intended by educators.…”
Section: Introduction (mentioning; confidence: 99%)