Speech analysis could help develop clinical tools for automatic detection of Alzheimer's disease (AD) and monitoring of its progression. However, datasets containing both clinical information and spontaneous speech suitable for statistical learning are relatively scarce. In addition, speech data are often collected under different conditions, such as monologue and dialogue recording protocols. Therefore, there is a need for methods that allow these scarce resources to be combined. In this paper, we propose two feature extraction and representation models based on neural networks, trained on monologue and dialogue data recorded in clinical settings. These models are evaluated not only for AD recognition, but also with respect to their potential to generalise across both datasets. They provide good results when trained and tested on the same dataset (72.56% unweighted average recall (UAR) for monologue data and 85.21% for dialogue). A decrease in UAR is observed in transfer training, where feature extraction models trained on dialogues provide better average UAR on monologues (63.72%) than the other way around (58.94%). When the classifier is chosen independently of the feature extraction model, transfer from monologue models to dialogues results in a maximum UAR of 81.04%, and transfer from dialogue features to monologues achieves a maximum UAR of 70.73%, evidencing the generalisability of the feature models.
Clinical relevance: We present a method for automatic screening of cognitive health in dementia risk settings. The method is based on spoken language, a ubiquitous source of data, making it cost-efficient, non-invasive and requiring little infrastructure.
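The evaluation metric throughout is UAR, the mean of per-class recalls, which is insensitive to class imbalance between AD and control recordings. The following is a minimal sketch of how UAR can be computed in a cross-corpus transfer evaluation; it assumes scikit-learn, and the feature matrices, labels and SVM classifier shown here are hypothetical placeholders rather than the paper's actual pipeline.

```python
# Minimal sketch: UAR metric and a hypothetical cross-corpus transfer evaluation.
# Assumes scikit-learn; data and classifier are placeholders, not the authors' setup.
import numpy as np
from sklearn.metrics import recall_score
from sklearn.svm import SVC


def unweighted_average_recall(y_true, y_pred):
    # UAR = macro-averaged recall, i.e. the mean of per-class recalls.
    return recall_score(y_true, y_pred, average="macro")


# Hypothetical transfer setting: features extracted by a model trained on one
# protocol (e.g. dialogue), classifier trained and tested on the other (monologue).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 64)), rng.integers(0, 2, 100)
X_test, y_test = rng.normal(size=(40, 64)), rng.integers(0, 2, 40)

clf = SVC().fit(X_train, y_train)
print(f"UAR: {unweighted_average_recall(y_test, clf.predict(X_test)):.4f}")
```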