Background
ChatGPT-4 is the latest release of a novel artificial intelligence (AI) chatbot able to answer freely formulated and complex questions. In the near future, ChatGPT could become the new standard for health care professionals and patients to access medical information. However, little is known about the quality of medical information provided by the AI.

Objective
We aimed to assess the reliability of medical information provided by ChatGPT.

Methods
Medical information provided by ChatGPT-4 on the 5 hepato-pancreatico-biliary (HPB) conditions with the highest global disease burden was measured with the Ensuring Quality Information for Patients (EQIP) tool. The EQIP tool is used to measure the quality of internet-available information and consists of 36 items that are divided into 3 subsections. In addition, 5 guideline recommendations per analyzed condition were rephrased as questions and input to ChatGPT, and agreement between the guidelines and the AI answer was measured by 2 authors independently. All queries were repeated 3 times to measure the internal consistency of ChatGPT.

Results
Five conditions were identified (gallstone disease, pancreatitis, liver cirrhosis, pancreatic cancer, and hepatocellular carcinoma). The median EQIP score across all conditions was 16 (IQR 14.5-18) for the total of 36 items. Divided by subsection, median scores for content, identification, and structure data were 10 (IQR 9.5-12.5), 1 (IQR 1-1), and 4 (IQR 4-5), respectively. Agreement between guideline recommendations and answers provided by ChatGPT was 60% (15/25). Interrater agreement as measured by the Fleiss κ was 0.78 (P<.001), indicating substantial agreement. Internal consistency of the answers provided by ChatGPT was 100%.

Conclusions
ChatGPT provides medical information of comparable quality to available static internet information. Although currently of limited quality, large language models could become the future standard for patients and health care professionals to gather medical information.

BACKGROUND
ChatGPT-4 is the latest release of a novel AI chatbot able to answer freely formulated complex questions. It could become the new standard for healthcare professionals and patients to access medical information in the near future. However, little is known about the quality of medical information provided by the AI.

OBJECTIVE
To analyse the quality of medical information provided by ChatGPT.

METHODS
Medical information provided by ChatGPT-4 on the five hepato-pancreatico-biliary (HPB) conditions with the highest global disease burden (GBD) was measured with the 36-item Ensuring Quality Information for Patients (EQIP) tool. Five guideline recommendations per analysed condition were rephrased as questions and input to ChatGPT, and agreement between the guidelines and the AI answer was measured by two authors independently. All queries were repeated three times to measure the internal consistency of ChatGPT.

RESULTS
Five conditions were identified (gallstone disease, pancreatitis, liver cirrhosis, pancreatic cancer and hepatocellular carcinoma). The median (IQR) EQIP score across all conditions was 16 (14.5-18) from a total of 36. Divided by subsection, median (IQR) scores for content, identification and structure data were 10 (9.5-12.5), 1 (1-1), and 4 (4-5), respectively. Agreement between guideline recommendations and answers provided by ChatGPT was 60% (15/25). Inter-rater agreement as measured by Cohen's kappa was 0.83 (95% confidence interval: 0.61–1.05), indicating a very high level of agreement. Internal consistency of the answers provided by ChatGPT was complete (100%).

CONCLUSIONS
ChatGPT provides medical information of comparable quality to available static internet information. Although currently of limited quality, large language models could become the future standard for patients and healthcare professionals to gather medical information.

CLINICALTRIAL
None
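
The agreement statistics reported above (raw agreement and Cohen's kappa between two independent raters) can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so the function names and the ratings below are hypothetical and are not the study's data; 1 stands for "the answer agrees with the guideline" and 0 for "it does not".

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement: probability that both raters pick the same label
    # if each labels independently at their own marginal rates.
    p_expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # degenerate case: both raters used one identical label
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings for eight answers (illustration only, not study data).
rater_a = [1, 1, 0, 0, 1, 0, 1, 0]
rater_b = [1, 1, 0, 0, 1, 0, 0, 0]

print(percent_agreement(rater_a, rater_b))  # 0.875
print(cohens_kappa(rater_a, rater_b))       # 0.75
```

Kappa is lower than raw agreement because part of the observed agreement would be expected by chance alone; in this toy example, half of it.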