Rationale and Objectives: Over the past year, studies have evaluated the performance of Large Language Models (LLMs), such as ChatGPT, in the field of gynecologic oncology. This review aims to analyze the applications and risks associated with using LLMs in this specialized field.

Materials and Methods: This systematic review was performed in adherence to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, incorporating elements from the diagnostic test accuracy extension and the CHARMS checklist for reviews of prediction models. A systematic literature search was executed on July 17, 2024, across the PubMed, Web of Science, and Scopus databases. We focused on identifying original research that integrates LLMs with gynecologic oncology. We assessed the risk of bias using adapted QUADAS-2 criteria.

Results: Our search identified eight studies that met our criteria, focusing on healthcare education, clinical practice, and medical code generation. These studies revealed variability in ChatGPT's performance across different applications. It excelled in genetic testing and counseling, achieving a 97% accuracy rate. However, its performance in cervical cancer prevention was less robust, with an accuracy of 83%. While one study demonstrated ChatGPT's high adherence to quality guidelines, another noted that established guidelines significantly outperformed ChatGPT's outputs. Additionally, code generation using tools such as Google Bard and RoBERTa has shown potential to improve accuracy in clinical predictions and quality assurance. For example, Natural Language Processing (NLP) assisted by RoBERTa (based on Google's BERT model) has improved the prediction of residual disease in women with advanced epithelial ovarian cancer following cytoreductive surgery. Despite these advancements, challenges related to consistency, specificity, and personalization persist, underscoring the necessity for continuous enhancement of these technologies.

Conclusion: LLMs demonstrate inconsistent performance in gynecologic oncology. These findings emphasize the need for continuous evaluation of these models before they are implemented clinically.
Background: The Japanese Circulation Society (JCS) 2022 Guideline on Perioperative Cardiovascular Assessment and Management for Non-Cardiac Surgery standardizes preoperative cardiovascular assessments. The present study investigated the efficacy of a large language model (LLM) in providing accurate responses consistent with the JCS 2022 Guideline. Methods and Results: Data on consultation requests, physicians' cardiovascular records, and patients' response content were analyzed. Virtual scenarios were created using real-world clinical data, and an LLM was then consulted on these scenarios. Conclusions: Google Bard could accurately provide responses in accordance with the JCS 2022 Guideline in low-risk cases. Google Gemini has significantly improved accuracy in intermediate- and high-risk cases.