Objective
To investigate the consistency of Chatbot Generative Pretrained Transformer (ChatGPT)‐4 in the analysis of clinical pictures of common laryngological conditions.

Study Design
Prospective uncontrolled study.

Setting
Multicenter study.

Methods
Patient history and clinical videolaryngostroboscopic images were presented to ChatGPT‐4 for differential diagnoses, management, and treatment(s). ChatGPT‐4 responses were assessed by 3 blinded laryngologists with the artificial intelligence performance instrument (AIPI). The complexity of cases and the consistency between practitioners and ChatGPT‐4 in interpreting clinical images were evaluated with a 5‐point Likert scale. The intraclass correlation coefficient (ICC) was used to measure the strength of interrater agreement.

Results
Forty patients with a mean complexity score of 2.60 ± 1.15 were included. The mean consistency score for ChatGPT‐4 image interpretation was 2.46 ± 1.42. ChatGPT‐4 perfectly analyzed the clinical images in 6 cases (15%; score 5/5), while the consistency between ChatGPT‐4 and judges was high in 5 cases (12.5%; score 4/5). Judges reported an ICC of 0.965 for the consistency score (P = .001). ChatGPT‐4 erroneously documented vocal fold irregularity (mass or lesion), glottic insufficiency, and vocal cord paralysis in 21 (52.5%), 2 (5.0%), and 5 (12.5%) cases, respectively. ChatGPT‐4 and practitioners indicated 153 and 63 additional examinations, respectively (P = .001). The primary diagnosis proposed by ChatGPT‐4 was correct in 20.0% to 25.0% of cases. The clinical image consistency score was significantly associated with the AIPI score (rs = 0.830; P = .001).

Conclusion
ChatGPT‐4 is more efficient at establishing a primary diagnosis than at analyzing clinical images or at selecting the most appropriate additional examinations and treatments.
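
The agreement statistics reported above (the ICC for interrater agreement on the consistency score and Spearman's rank correlation between the image-consistency and AIPI scores) could be computed from per-case Likert ratings roughly as follows. This is a minimal sketch with hypothetical data and column names ("case", "judge", "consistency"), assuming the pingouin and scipy packages are available; it is not the authors' analysis code.

import pandas as pd
import pingouin as pg
from scipy.stats import spearmanr

# Hypothetical long-format ratings: one row per (case, judge) pair.
# The actual study used 40 cases rated by 3 blinded laryngologists.
ratings = pd.DataFrame({
    "case":        [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "judge":       ["A", "B", "C"] * 3,
    "consistency": [5, 5, 4, 2, 2, 3, 1, 1, 1],  # 5-point Likert scores
})

# Interrater agreement on the consistency score (ICC).
icc = pg.intraclass_corr(data=ratings, targets="case",
                         raters="judge", ratings="consistency")
print(icc[["Type", "ICC", "pval"]])

# Association between per-case image-consistency and AIPI scores
# (hypothetical per-case means; Spearman's rho, as in rs = 0.830).
per_case_consistency = [4.7, 2.3, 1.0]
per_case_aipi = [18.0, 11.5, 6.0]
rho, pval = spearmanr(per_case_consistency, per_case_aipi)
print(f"Spearman rs = {rho:.3f}, P = {pval:.3f}")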