Artificial intelligence (AI), with its diverse applications across sectors, including healthcare, education, and finance, has brought groundbreaking changes to numerous fields. [1,2] A notable offshoot, natural language processing, empowers computers to understand and produce human language. Among natural language processing tools, large language models stand out. These models, particularly OpenAI's (OpenAI, Inc., San Francisco, CA, USA) GPT (Generative Pre-Trained Transformer) series, culminating in GPT-4 in 2023, utilize deep learning to generate human-like text, revolutionizing interfaces such as chatbots. [3-6] Its capabilities span from
Objectives: This study presents the first investigation into the potential of ChatGPT to provide medical consultation for patients undergoing orthopedic interventions, with the primary objective of evaluating ChatGPT's effectiveness in supporting patient self-management during the essential early recovery phase at home.
Materials and methods: Seven scenarios, representative of common situations in orthopedics and traumatology, were presented to ChatGPT version 4.0 to obtain advice. These scenarios and ChatGPT's responses were then evaluated by 68 expert orthopedists (67 males, 1 female; mean age: 37.9±5.9 years; range, 30 to 59 years), 40 of whom had at least four years of orthopedic experience, while 28 were associate or full professors. The expert orthopedists used a rubric with a scale of 1 to 5 to evaluate ChatGPT's advice for accuracy, applicability, comprehensiveness, and clarity. A score of 4 or higher indicated that the evaluator considered ChatGPT's performance above average or excellent.
Results: In all scenarios, the median evaluation scores were at least 4 across accuracy, applicability, comprehensiveness, and communication. As for mean scores, accuracy was the highest-rated dimension at 4.2±0.8, while mean comprehensiveness was slightly lower at 3.9±0.8.
Orthopedist characteristics, such as academic title and prior use of ChatGPT, did not influence the evaluations (all p>0.05). Across all scenarios, ChatGPT demonstrated an accuracy of 79.8%, with applicability at 75.2%, comprehensiveness at 70.6%, and a 75.6% rating for communication clarity.
Conclusion: This study emphasizes ChatGPT's strengths in accuracy and applicability for home care after orthopedic intervention but underscores a need for improved comprehensiveness. This focused evaluation not only sheds light on ChatGPT's potential in specialized medical advice but also suggests that it could play a broader role in the advancement of public health.