2023
DOI: 10.1016/j.mcpdig.2023.05.004

Learning to Fake It: Limited Responses and Fabricated References Provided by ChatGPT for Medical Questions

Cited by 71 publications (25 citation statements)
References: 29 publications
“…False responses by ChatGPT-4 were recognized and described by the developer 20 and were also reported in other research in the field of medicine. 31,32 As discussed by Lee et al in a recent report in NEJM, 'a false response by GPT-4, referred to as a "hallucination," and such errors can be particularly dangerous in medical scenarios because the errors or falsehoods can be subtle and are often stated by the chatbot in such a convincing manner that the person making the query may be convinced of its veracity'. 33 Furthermore, the AI generated supplementary criteria within its outputs, drawing attention to important aspects of deprescribing, such as the role of patient's involvement and importance of shared decision-making.…”
Section: Discussion (mentioning)
confidence: 99%
“…Others (e.g., a study by Birmaher et al) 6 did not relate to the question, others (e.g., the Schmideberg reference cited in the creativity question) were fabricated, while some (e.g., the WHO reports, 4 and the study by Merikangas et al 5 ) were inaccurate. Inaccurate and fabricated referencing is a well‐known limitation of ChatGPT, 7–9 which more often places emphasis on generating the most plausible sounding references. While ChatGPT was able to generate factual information related to bipolar disorder with reasonable accuracy (with some notable exceptions, such as the specific prevalence rates cited), such information was provided more at the level of a school essay than a scientific journal.…”
Section: Discussion (mentioning)
confidence: 99%
“…In testing, aiChat was able to adjust the response by date when given this parameter. Given that studies have established that GPT often fabricates references [21,51], we did not specifically ask aiChat to provide references as part of the prompt. Providing an example of the desired output within the prompt has also been suggested [50] and found to improve performance in some analyses [9].…”
Section: Methods (mentioning)
confidence: 99%
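The prompting approach described in that last statement (deliberately not asking for references, and supplying an example of the desired output within the prompt) can be illustrated with a short sketch. The excerpt does not say how aiChat was implemented, so the snippet below is only an assumption-laden illustration: the OpenAI Python SDK, the model name, the system instruction, and the example question/answer pair are all placeholders rather than the study's actual prompt.

```python
# Minimal sketch of the prompt design described above (assumptions, not the
# study's actual aiChat setup): the prompt deliberately does NOT request
# references, and it includes one example of the desired output format.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EXAMPLE_QUESTION = "What lifestyle changes help manage hypertension?"  # placeholder
EXAMPLE_ANSWER = (
    "- Reduce dietary sodium\n"
    "- Increase regular aerobic exercise\n"
    "- Limit alcohol intake"
)  # placeholder showing the desired output format (no citations requested)


def ask(question: str) -> str:
    """Query the model with a one-shot example and no request for references."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the medical question as a concise bulleted list. "
                    "Do not include citations or reference lists."
                ),
            },
            # One-shot example of the desired output, as suggested in the excerpt
            {"role": "user", "content": EXAMPLE_QUESTION},
            {"role": "assistant", "content": EXAMPLE_ANSWER},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask("What are first-line treatments for type 2 diabetes?"))  # placeholder query
```

The design choice reflects the two points raised in the citation statement: omitting a request for references avoids inviting fabricated citations, and the one-shot example constrains the output format, which some analyses have found improves performance.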