BACKGROUND
Artificial intelligence models like ChatGPT have advanced significantly, with GPT-4o offering improved accuracy and contextual understanding. In healthcare, ChatGPT provides accessible explanations of complex medical concepts, aiding patient education and reducing clinician workload. It is particularly effective in simplifying medical jargon, addressing patient questions, and fostering engagement. However, limitations include misinformation risks, outdated data, and potential biases. For lung cancer, the leading cause of cancer-related deaths globally, patients require reliable, comprehensive, and accessible educational tools, particularly for complex treatments like radiotherapy. ChatGPT offers potential as a supplementary resource to meet these needs, though careful oversight is required to address its shortcomings.
OBJECTIVE
This study aims to evaluate the educational capabilities and limitations of GPT-4 for patients undergoing radiotherapy for lung cancer, focusing on clinician-led assessments of relevance, correctness, and completeness; patient-led evaluations of the educational content; and a readability analysis of response accessibility.
METHODS
Eight questions related to lung cancer radiotherapy were posed to GPT-4 (July 2024) via OpenAI’s web interface. Responses were assessed for readability using the Modified Flesch Reading Ease (FRE) formula and the 4th Vienna Formula (Wiener Sachtextformel, WSTF). Six clinicians (two radiation oncologists, two medical oncologists, and two thoracic surgeons) experienced in the treatment of lung cancer rated relevance, correctness, and completeness on a five-point Likert scale (1 = strongly disagree, 5 = strongly agree). During post-radiotherapy follow-up, patients evaluated the responses’ comprehensibility, accuracy, relevance, and trustworthiness, as well as their willingness to use ChatGPT for future medical questions. Data were analyzed using descriptive statistics (median, mean, standard deviation) in Microsoft Excel (version 2410). Figures were created in Python (version 3.8) using Matplotlib, with data structured in Pandas DataFrames for analysis and visualization.
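For illustration, the sketch below shows how the two readability scores and the descriptive statistics described above could be computed in Python with Pandas and Matplotlib. It is a minimal sketch, not the study's analysis code: the coefficients shown for the German-adapted (Amstad) Flesch Reading Ease formula and the fourth Wiener Sachtextformel are the commonly cited ones and are assumed here, and the per-response word, sentence, and syllable counts are hypothetical placeholders.

```python
# Minimal sketch of the readability scoring and descriptive statistics described above.
# Assumptions: commonly cited coefficients for the Amstad-modified FRE and the fourth
# Wiener Sachtextformel; the per-response counts below are hypothetical, illustrative values.

import pandas as pd
import matplotlib.pyplot as plt


def modified_fre(words: int, sentences: int, syllables: int) -> float:
    """German-adapted (Amstad) Flesch Reading Ease score (assumed formula)."""
    asl = words / sentences        # average sentence length
    asw = syllables / words        # average syllables per word
    return 180 - asl - 58.5 * asw


def wstf_4(words: int, sentences: int, long_words: int) -> float:
    """Fourth Wiener Sachtextformel (assumed coefficients); approximates a school grade."""
    ms = 100 * long_words / words  # percentage of words with >= 3 syllables
    sl = words / sentences         # mean sentence length in words
    return 0.2744 * ms + 0.2656 * sl - 1.693


# Hypothetical counts for the eight ChatGPT responses (placeholders, not study data).
responses = pd.DataFrame({
    "question": [f"Q{i}" for i in range(1, 9)],
    "words": [310, 280, 295, 260, 330, 300, 275, 290],
    "sentences": [14, 12, 13, 11, 15, 13, 12, 13],
    "syllables": [680, 610, 650, 560, 720, 660, 600, 640],
    "long_words": [95, 82, 88, 75, 100, 90, 80, 86],
})

responses["FRE"] = responses.apply(
    lambda r: modified_fre(r["words"], r["sentences"], r["syllables"]), axis=1)
responses["WSTF"] = responses.apply(
    lambda r: wstf_4(r["words"], r["sentences"], r["long_words"]), axis=1)

# Descriptive statistics (mean, standard deviation, median) as reported in the study.
print(responses[["FRE", "WSTF"]].agg(["mean", "std", "median"]))

# Simple per-question visualization of both scores.
responses.plot(x="question", y=["FRE", "WSTF"], kind="bar")
plt.ylabel("Readability score")
plt.tight_layout()
plt.savefig("readability_scores.png")
```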
RESULTS
In the readability analysis, ChatGPT's responses were classified as "very difficult" or "difficult to read" (FRE: 23.36 ± 11.16; WSTF: 13.81 ± 2.01). Clinicians' mean ratings ranged from 3.7 to 4.3 for relevance, 3.5 to 4.3 for correctness, and 3.5 to 4.2 for completeness, with ChatGPT's response to the question "What follow-up care is required after radiotherapy for lung cancer?" scoring highest across all dimensions. Thirty consecutive patients (48–87 years; median 66 years) who had received radiotherapy for lung cancer rated clarity highly ("easy to understand": 4.4 ± 0.61), whereas trustworthiness and usability scored lower ("confidence in information": 4.0 ± 0.84). These results highlight ChatGPT's strengths in accessibility and relevance, with room for improvement in trustworthiness and usability.
CONCLUSIONS
ChatGPT shows promise as a supplementary tool for patient education in radiation oncology, offering clear and relevant information. However, limitations in completeness and trustworthiness necessitate careful review and supplementation by healthcare professionals. Further advancements and standardized evaluation criteria are essential for its effective integration into clinical practice.