Background:
Artificial intelligence (AI) technologies, particularly large language models (LLMs), have been widely adopted by the medical community. Given the intricacies of urology, ChatGPT offers a novel means of aiding clinical decision-making. This study aimed to investigate the decision-making ability of LLMs in solving complex urology-related problems and to assess their effectiveness in providing psychological support to patients with urological disorders.
Materials and Methods:
This study evaluated the clinical and psychological support capabilities of ChatGPT 3.5 and 4.0 in the field of urology. A total of 69 clinical and 30 psychological questions were posed to the AI models, and their responses were evaluated by both urologists and psychologists. As a control, clinicians from Chinese medical institutions provided responses under closed-book conditions. Statistical analyses were conducted separately for each subgroup.
Results:
In multiple-choice tests covering diverse urological topics, ChatGPT 4.0 performed comparably to the physician group, with no significant difference in overall scores. Subgroup analyses revealed variable performance depending on disease type and physician experience, with ChatGPT 4.0 generally outperforming ChatGPT 3.5 and achieving competitive results against physicians. In the assessment of psychological support capabilities, ChatGPT 4.0 outperformed ChatGPT 3.5 across all urology-related psychological questions.
Conclusions:
LLMs showed certain advantages over clinicians in addressing standardized clinical problems and providing psychological support. AI stands out as a promising tool for potential clinical aid.