Summary

This study evaluates the efficacy of GPT-4 in screening for Mild Cognitive Impairment (MCI) in the elderly, comparing it with junior neurologists. MCI is a precursor to dementia and a significant public health concern given the aging global population. With over 55 million people affected by dementia worldwide, early detection is essential for timely intervention. Common screening tools, while effective, are resource-intensive, highlighting the need for more efficient methods. The study used an exploratory design with 174 participants, comparing the performance of GPT-4 against three junior neurologists. The GPT-4 model was trained using a set of language analysis indicators to evaluate the severity of MCI. Participants’ test texts and voice recordings were grouped and then assessed independently by the neurologists and the GPT-4 model: the neurologists assessed both the text and the voice of each test, while the GPT-4 model assessed the text only. The GPT-4 model achieved higher accuracy (0.81) than the neurologists (0.41 to 0.49) and discriminated MCI significantly better (p < 0.001). The study also developed a clinical risk assessment nomogram based on the ten most heavily weighted features from GPT-4’s analysis, to aid in the evaluation of MCI patients. In conclusion, the GPT-4 model shows promise as a diagnostic aid for MCI, potentially improving patient outcomes and reducing healthcare burdens. However, its practical applicability in real-world scenarios requires further investigation and clinical validation.