Evaluating the Efficacy of ChatGPT in Navigating the Spanish Medical Residency Entrance Examination (MIR): Promising Horizons for AI in Clinical Medicine

Guillen-Grima, Francisco; Guillen-Aguinaga, Sara; Guillen-Aguinaga, Laura; Alas-Brun, Rosa; Onambele, Luc; Ortega, Wilfrido; Montejo, Rocio; Aguinaga-Ontoso, Enrique; Barach, Paul; Aguinaga-Ontoso, Ines

doi:10.3390/clinpract13060130

Cited by 27 publications

(16 citation statements)

References 120 publications


“…Additionally, a recent study showed the superior performance of four generative AI models in English compared to Arabic in infectious disease queries [39], while an earlier study showed the inferior performance of ChatGPT in general health queries in Arabic dialects [35]. Additionally, the inferior performance of AI chatbots was reported in other non-English languages including Chinese [40], Polish [41], and Spanish [42].…”

Section: Discussion (mentioning)

confidence: 98%

The Performance of OpenAI ChatGPT-4 and Google Gemini in Virology Multiple-Choice Questions: A Comparative Analysis of English and Arabic Responses

Sallam, Al-Mahzoum, Almutawaa et al. 2024

Preprint


Background: The integration of artificial intelligence (AI) in healthcare education is inevitable. Understanding the proficiency of generative AI in different languages to answer complex questions is crucial for educational purposes. Objective: To compare the performance of ChatGPT-4 and Gemini in answering Virology multiple-choice questions (MCQs) in English and Arabic, while assessing the quality of the generated content. Methods: Both AI models’ responses to 40 Virology MCQs were assessed for correctness and quality based on the CLEAR tool designed for evaluation of AI-generated content. The MCQs were classified into lower and higher cognitive categories based on the revised Bloom’s taxonomy. The study design considered the METRICS checklist for the design and reporting of generative AI-based studies in healthcare. Results: ChatGPT-4 and Gemini performed better in English compared to Arabic, with ChatGPT-4 consistently surpassing Gemini in correctness and CLEAR scores. ChatGPT-4 led Gemini with 80% vs. 62.5% correctness in English compared to 65% vs. 55% in Arabic. For both AI models, superior performance in lower cognitive domains was reported. Conclusion: Both ChatGPT-4 and Gemini exhibited potential in educational applications; nevertheless, their performance varied across languages, highlighting the importance of continued development to ensure effective AI integration in healthcare education globally.


“…Further exploring the advances in different languages, a study on GPT-4-v reveals its significant performance in medical education and assessment with Spanish (Guillen-Grima et al 2023). The evaluation of GPT’s performance in the Spanish medical residency entrance examination (MIR) assessed its ability to answer 182 multiple-choice questions across various medical specialties in both Spanish and English.…”

Section: Language Models For Medical Imaging (mentioning)

confidence: 99%

Advancing medical imaging with language models: featuring a spotlight on ChatGPT

Hu, Qian, Pan et al. 2024

Phys. Med. Biol.


This review paper aims to serve as a comprehensive guide and instructional resource for researchers seeking to effectively implement language models in medical imaging research. First, we presented the fundamental principles and evolution of language models, dedicating particular attention to large language models. We then reviewed the current literature on how language models are being used to improve medical imaging, emphasizing a range of applications such as image captioning, report generation, report classification, findings extraction, visual question response systems, interpretable diagnosis and so on. Notably, the capabilities of ChatGPT were spotlighted for researchers to explore its further applications. Furthermore, we covered the advantageous impacts of accurate and efficient language models in medical imaging analysis, such as the enhancement of clinical workflow efficiency, reduction of diagnostic errors, and assistance of clinicians in providing timely and accurate diagnoses. Overall, our goal is to have better integration of language models with medical imaging, thereby inspiring new ideas and innovations. It is our aspiration that this review can serve as a useful resource for researchers in this field, stimulating continued investigative and innovative pursuits of the application of language models in medical imaging.


“…Additionally, the unpaid technology of ChatGPT (ChatGPT 3.5) has not been upgraded, with the model not incorporating the latest information beyond 2021 into its training data. In terms of the commercial version, the paid version, ChatGPT 4, outperforms the prior free version, ChatGPT 3.5 [106]. This enhancement has the potential to reduce medical errors and decrease fatigue due to its enhanced processing capability, which includes the ability to handle pictures and more complex data.…”

Section: AI: Promises Pitfalls and The Unmet Needs For Its Implementa... (mentioning)

confidence: 99%

Revolutionizing Women’s Health: A Comprehensive Review of Artificial Intelligence Advancements in Gynecology

Brandão, Mendes, Martins et al. 2024

JCM


Artificial intelligence has yielded remarkably promising results in several medical fields, namely those with a strong imaging component. Gynecology relies heavily on imaging since it offers useful visual data on the female reproductive system, leading to a deeper understanding of pathophysiological concepts. The applicability of artificial intelligence technologies has not been as noticeable in gynecologic imaging as in other medical fields so far. However, due to growing interest in this area, some studies have been performed with exciting results. From urogynecology to oncology, artificial intelligence algorithms, particularly machine learning and deep learning, have shown huge potential to revolutionize the overall healthcare experience for women’s reproductive health. In this review, we aim to establish the current status of AI in gynecology, the upcoming developments in this area, and discuss the challenges facing its clinical implementation, namely the technological and ethical concerns for technology development, implementation, and accountability.
