2023
DOI: 10.6018/edumed.556511

¿Es capaz “ChatGPT” de aprobar el examen MIR de 2022? Implicaciones de la inteligencia artificial en la educación médica en España (Is ChatGPT capable of passing the 2022 MIR exam? Implications of artificial intelligence for medical education in Spain)

Abstract: Artificial intelligence and natural language processing models have entered the field of medical education. Among them, the ChatGPT model has been used to attempt various international medical examinations. However, no literature addresses this phenomenon in Europe or in other Spanish-speaking countries. The present paper aims to evaluate the ability of the ChatGPT model to answer questions from the 2022 MIR exam, which grants access to the Spanish postgraduate training system. To this end…

Cited by 21 publications (14 citation statements); References: 8 publications.

“…Another study on the Neurosurgery Oral Board Preparation Question Bank showed that GPT-4 performed with an accuracy of 82.6%, while GPT-3.5 achieved an accuracy of 62.4% [20]. However, in our study, GPT-3.5 performed better on the NLME compared to previous studies where it failed examinations, including the USMLE and Spanish, Japanese, and Chinese NLMEs [2,21-23]. This can be explained by our use of a prompt that resembles the "chain-of-thought prompting approach," in which ChatGPT decomposes multistep problems into smaller and manageable steps to enhance accuracy [24].…”
Section: Principal Findings (contrasting)
confidence: 87%
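The cited study does not reproduce its exact prompt, but a chain-of-thought style instruction for a multiple-choice item might look like the minimal Python sketch below. It assumes the OpenAI Python client (openai>=1.0); the model name, prompt wording, and sample question are illustrative assumptions, not the study's own materials.

```python
# Minimal sketch of a chain-of-thought style prompt for a multiple-choice
# exam question. The model name, prompt wording, and example question are
# illustrative assumptions, not the prompt used in the cited study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTION = (
    "A 65-year-old patient presents with sudden-onset chest pain radiating "
    "to the back and a diastolic murmur. What is the most likely diagnosis?\n"
    "A) Acute myocardial infarction\n"
    "B) Aortic dissection\n"
    "C) Pulmonary embolism\n"
    "D) Pericarditis"
)

# Chain-of-thought framing: ask the model to reason through the findings
# step by step before committing to a single answer choice.
prompt = (
    "Answer the following multiple-choice question. First, break the problem "
    "into smaller steps and reason through each relevant finding. Then state "
    "the single best answer as a letter.\n\n"
    f"{QUESTION}"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Asking the model to reason before giving the final letter is what distinguishes this framing from a bare "answer with one letter" prompt, and is the mechanism the quoted authors credit for the accuracy gain.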
“…Previous studies have evaluated the efficacy of AI-based models in addressing MCQs across diverse healthcare disciplines, yielding varying outcomes (Newton and Xiromeriti, 2023). These assessments encompassed the use of ChatGPT in scenarios such as the United States Medical Licensing Examination (USMLE), as well as its performance in fields such as parasitology and ophthalmology, among others, and in various languages (e.g., Japanese, Chinese, German, Spanish) (Antaki et al., 2023; Carrasco et al., 2023; Friederichs et al., 2023; Huh, 2023; Kung et al., 2023; Takagi et al., 2023; Xiao et al., 2023). A recent scoping review examined the influence of the ChatGPT model on examination outcomes, including instances where ChatGPT demonstrated superior performance compared to students, albeit in a minority of cases (Newton and Xiromeriti, 2023).…”
Section: Introduction (mentioning)
confidence: 99%
“…A previous Spanish study submitted the same MIR examination to GPT-3.5 and obtained a lower score (54.8%) than ours [26]. One explanation for the difference could be the prompt used.…”
Section: Discussion (mentioning)
confidence: 46%