2023
DOI: 10.3390/clinpract13060130

Evaluating the Efficacy of ChatGPT in Navigating the Spanish Medical Residency Entrance Examination (MIR): Promising Horizons for AI in Clinical Medicine

Francisco Guillen-Grima,
Sara Guillen-Aguinaga,
Laura Guillen-Aguinaga
et al.

Abstract: The rapid progress in artificial intelligence, machine learning, and natural language processing has led to increasingly sophisticated large language models (LLMs) for use in healthcare. This study assesses the performance of two LLMs, the GPT-3.5 and GPT-4 models, in passing the MIR medical examination for access to medical specialist training in Spain. Our objectives included gauging the model’s overall performance, analyzing discrepancies across different medical specialties, discerning between theoretical …

Cited by 27 publications (16 citation statements)
References 120 publications
“…Additionally, a recent study showed the superior performance of four generative AI models in English compared to Arabic in infectious disease queries [39], while an earlier study showed the inferior performance of ChatGPT in general health queries in Arabic dialects [35]. Additionally, the inferior performance of AI chatbots was reported in other non-English languages including Chinese [40], Polish [41], and Spanish [42].…”
Section: Discussion (mentioning)
confidence: 98%
“…Further exploring the advances in different languages, a study on GPT-4-v reveals its significant performance in medical education and assessment with Spanish (Guillen-Grima et al 2023 ). The evaluation of GPT’s performance in the Spanish medical residency entrance examination (MIR) assessed its ability to answer 182 multiple-choice questions across various medical specialties in both Spanish and English.…”
Section: Language Models For Medical Imaging (mentioning)
confidence: 99%
“…Additionally, the unpaid technology of ChatGPT (ChatGPT 3.5) has not been upgraded, with the model not incorporating the latest information beyond 2021 into its training data. In terms of the commercial version, the paid version, ChatGPT 4, outperforms the prior free version, ChatGPT 3.5 [106]. This enhancement has the potential to reduce medical errors and decrease fatigue due to its enhanced processing capability, which includes the ability to handle pictures and more complex data.…”
Section: Ai: Promises Pitfalls and The Unmet Needs For Its Implementa... (mentioning)
confidence: 99%