2023
DOI: 10.1002/ski2.313

Comparison of large language models in management advice for melanoma: Google's AI BARD, BingAI and ChatGPT

Xin Mu,
Bryan Lim,
Ishith Seth
et al.

Abstract: Large language models (LLMs) are an emerging artificial intelligence (AI) technology refining research and healthcare, with numerous recent applications in medicine. One area where LLMs have shown particular promise is the provision of medical information and guidance to practitioners. This study aims to assess three prominent LLMs (Google's AI BARD, BingAI, and ChatGPT-4) in providing management advice for melanoma by comparing their responses to current clinical guidelines and existing literature.…

Cited by 13 publications (4 citation statements)
References: 43 publications
“…Hillmann et al [15] posed questions about atrial fibrillation and cardiac implantable electronic devices to various chatbots and, similarly to our study, found that ChatGPT scored lower on readability. However, Mu et al [16] and Seth et al [17] asked chatbots questions about melanoma and rhinoplasty, respectively; the first study found no significant difference in readability among the chatbots, while the study by Seth et al found ChatGPT and BARD (now called Gemini) to be superior in readability. Haver et al [18], using ChatGPT-3.5, entered an additional prompt asking the model to simplify its answers to questions about breast cancer prevention and screening, and found that the simplified responses were statistically significantly easier to read than the originals.…”
Section: Discussion
confidence: 99%
“…Recommendations by the American Medical Association and the National Institutes of Health suggest that materials related to plastic surgery should be written at a sixth- to eighth-grade reading level [47]. However, recent research indicates that the reading level of LLM outputs exceeds these recommendations, requiring a higher level of patient understanding [48,49]. This discrepancy could potentially undermine the relationship between patients and healthcare providers, representing a significant barrier to the effective implementation of AI-driven chatbot perioperative tools in plastic surgery contexts.…”
Section: Discussion
confidence: 99%
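The grade-level comparisons in these studies typically rest on the Flesch-Kincaid grade level formula. A minimal sketch of that computation follows; the regex-based syllable counter is a rough heuristic of my own, not the validated readability tooling such studies use:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels, dropping a silent trailing 'e'.
    word = word.lower()
    if word.endswith("e") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

sample = "Apply sunscreen every day. See a doctor if a mole changes shape or color."
print(round(flesch_kincaid_grade(sample), 1))  # prints 4.8
```

A result near 5 indicates roughly fifth-grade reading level, within the sixth- to eighth-grade target; patient materials scoring well above 8 would exceed the recommendation discussed above.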
“…Related research has utilized LLMs' semiotic capacity to demonstrate their significant aptitude in answering patient questions and providing management/treatment advice for diseases and symptomatologies in otolaryngology and across specialties [32–36]. The potential of these models does not stop there; this study's demonstration of LLM language abilities with simple medical information could be used to streamline documentation and triage in the ER, or even to assist with translation and readability services in the clinic. While this paper's findings are specific to otolaryngology, a broader perspective must be emphasized due to the vast capabilities of LLMs.…”
Section: Comparison to the Literature and Clinical Utility
confidence: 99%