In this paper, we seek to automatically identify Hungarian patients suffering from mild cognitive impairment (MCI) or mild Alzheimer’s disease (mAD) based on their speech transcripts, focusing only on linguistic features. In addition to the features examined in our earlier study, we introduce syntactic, semantic and pragmatic features of spontaneous speech that might affect the detection of dementia. To ascertain which features are most useful for distinguishing healthy controls, MCI patients and mAD patients, we carry out a statistical analysis of the data and investigate the significance level of the extracted features for various speaker group pairs and various speaking tasks. In the second part of the paper, we use this rich feature set as a basis for effective discrimination among the three speaker groups. In our machine learning experiments, we analyze the efficacy of each feature group separately. Our model, which uses all the features, achieves competitive scores with or without demographic information (3-class accuracy: 68–70%; 2-class accuracy: 77.3–80%). We also analyze how different data recording scenarios affect the linguistic features and how these features can be used productively to distinguish MCI patients from healthy controls.