Graph neural networks (GNNs) offer the opportunity to provide new insights into the diagnosis, prognosis and contributing factors of Alzheimer’s disease (AD). However, previous studies applying GNNs to AD either did not incorporate tau PET neuroimaging features or, more importantly, applied class labels based only on clinician diagnosis, which can be inconsistent. This study addresses these limitations by applying cluster-based labelling to heterogeneous data, including tau PET neuroimaging data, before applying a GNN for classification. The data comprised sociodemographic variables, medical/family history, cognitive and functional assessments, apolipoprotein E (APOE) ε4 allele status, and combined tau PET neuroimaging and MRI data. We identified 5 clusters embedded in a 5-dimensional nonlinear UMAP space. After further projection onto a 3-dimensional UMAP space for visualisation, association rule and information gain algorithms revealed that the individual clusters had specific feature characteristics, particularly with respect to clinical diagnosis of AD, gender, parental history of AD, age, and underlying neurological risk factors. In particular, one cluster comprised mainly cases with a clinician diagnosis of AD. We re-labelled the diagnostic classes around this AD cluster such that AD cases occurred only within this cluster, while AD cases outside of it were re-labelled as a prodromal stage of AD. A post-hoc analysis supported the re-labelling of these cases, given their similar brain tau PET levels, which lay between those of the remaining AD and non-AD cases. We then trained a GNN model using the re-labelled data and compared its AD classification performance to that of a model trained using clinician diagnostic labels. We found that the GNN trained on re-labelled data had significantly higher average classification accuracy (p=0.011) and less variability (93.2±0.03%) than the GNN trained on clinician-labelled data (90.06±0.04%).
During model training, the cluster-labelled GNN model also converged faster than the model trained using clinician labels. Overall, our work demonstrates that more objective cluster-based labels are viable for training GNNs on heterogeneous data for diagnosing AD and cognitive decline, especially when aided by tau PET neuroimaging data.
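
The re-labelling rule described above can be illustrated with a minimal sketch. It assumes that cluster assignments and clinician diagnoses are already available as parallel lists, and that the "AD cluster" is chosen as the cluster with the highest proportion of clinician-diagnosed AD cases. That selection criterion, the function names `find_ad_cluster` and `relabel`, the label strings, and the toy data are all hypothetical and are not taken from the study's implementation.

```python
from collections import Counter

# Hypothetical label strings, used only for illustration.
AD = "AD"
PRODROMAL = "prodromal AD"


def find_ad_cluster(clusters, diagnoses):
    """Return the cluster id with the highest fraction of clinician-diagnosed AD.

    Assumption: the AD cluster is the one dominated by AD diagnoses.
    """
    totals = Counter(clusters)
    ad_counts = Counter(c for c, d in zip(clusters, diagnoses) if d == AD)
    return max(totals, key=lambda c: ad_counts[c] / totals[c])


def relabel(clusters, diagnoses):
    """Keep the AD label only inside the AD cluster.

    AD cases that fall outside the AD cluster are re-labelled as prodromal;
    all other labels are left unchanged.
    """
    ad_cluster = find_ad_cluster(clusters, diagnoses)
    return [
        PRODROMAL if d == AD and c != ad_cluster else d
        for c, d in zip(clusters, diagnoses)
    ]


# Toy data: eight participants spread over three clusters.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
diagnoses = ["AD", "AD", "non-AD", "non-AD", "AD", "non-AD", "non-AD", "AD"]

ad_cluster = find_ad_cluster(clusters, diagnoses)
new_labels = relabel(clusters, diagnoses)
print(ad_cluster)
print(new_labels)
```

In this toy example, cluster 0 has the largest share of AD diagnoses, so the two AD cases in clusters 1 and 2 become prodromal cases, while the non-AD labels are untouched. The re-labelled list would then serve as the class labels for GNN training.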