Objective
Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insight into patients’ health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI’s Generative Pre-trained Transformer 4 (GPT-4) model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal is to identify disease stages, treatments, and progression using GPT-4, and to compare its performance against GPT-3.5-turbo and two rule-based and machine-learning-based tools, scispaCy and medspaCy.

Materials and Methods
Four phenotypes (initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence) were identified from 13,646 records for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of GPT-4 is evaluated against GPT-3.5-turbo, medspaCy, and scispaCy by comparing precision, recall, and weighted F1 scores.

Results
GPT-4 achieves higher F1 score, precision, and recall than the medspaCy and scispaCy models. GPT-3.5-turbo performs similarly to GPT-4. GPT models are not constrained by explicit rule requirements for contextual pattern recognition, whereas the spaCy-based models rely on predefined patterns, which leads to their suboptimal performance.

Discussion and Conclusion
GPT-4 improves clinical phenotype identification owing to its robust pre-training and its pattern recognition capability on the embedded tokens. It remains effective even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text, more robust clinical phenotype extraction, and an improved ability to support patient care.
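The abstract compares models by precision, recall, and weighted F1 but does not spell out how the scores are computed. The sketch below shows one common definition, support-weighted averaging of per-class scores, which is the convention scikit-learn uses for `average="weighted"`. It is an assumption that the study followed this convention. The function name `weighted_scores` and the stage labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted (precision, recall, F1).

    Each class's score is weighted by its share of the true labels.
    Assumption: this mirrors the usual "weighted" averaging convention;
    the study's exact evaluation code is not given in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-class precision is 0 when the class was never predicted.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


# Hypothetical gold and predicted initial-stage labels for four patients.
gold = ["I", "I", "III", "IV"]
pred = ["I", "II", "III", "IV"]
p, r, f = weighted_scores(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Under this definition, weighted recall equals plain accuracy, so on imbalanced phenotype labels the weighted F1 is dominated by the most frequent classes.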