Introduction:
Recent advancements in generative AI, exemplified by ChatGPT, hold promise for healthcare applications such as decision-making support, education, and patient engagement. However, rigorous evaluation is crucial to ensure reliability and safety in clinical contexts. This scoping review explores ChatGPT’s role in clinical inquiry, focusing on its characteristics, applications, challenges, and evaluation.
Methods:
This review, conducted in 2023, followed PRISMA-ScR guidelines, Supplemental Digital Content 1, http://links.lww.com/MS9/A636. Searches were performed across PubMed, Scopus, IEEE, Web of Science, Cochrane, and Google Scholar using relevant keywords. The review explored ChatGPT’s effectiveness in various medical domains, evaluation methods, target users, and comparisons with other AI models. Data synthesis and analysis incorporated both quantitative and qualitative approaches.
Results:
Analysis of 41 academic studies highlights ChatGPT’s potential in medical education, patient care, and decision support, though performance varies by medical specialty and linguistic context. GPT-3.5, frequently referenced in 26 studies, demonstrated adaptability across diverse scenarios. Challenges include limited access to official answer keys and inconsistent performance, underscoring the need for ongoing refinement. Evaluation methods, including expert comparisons and statistical analyses, provided significant insights into ChatGPT’s efficacy. The identification of target users, such as medical educators and non-expert clinicians, illustrates its broad applicability.
Conclusion:
ChatGPT shows significant potential in enhancing clinical practice and medical education. Nevertheless, continuous refinement is essential for its successful integration into healthcare, aiming to improve patient care outcomes and address the evolving needs of the medical community.