Objective: Clustering is applied to biomedical datasets to identify meaningful subgroups of patients, proteins, genes, and diseases. Explainable AI (XAI) brings transparency and interpretability to the formation, composition, and quality of these clusters. This study creates a formal explanation space to enhance the interpretability of clusters of neurology phenotypes.
Methods: Subjects with dementia, movement disorders, and multiple sclerosis were clustered by neurological phenotype using two spectral methods, spectral coclustering and spectral biclustering. To improve the interpretability of the clusters, we created an explanation space that described the data, explained the algorithms, evaluated cluster separation and quality, identified influential features, visualized cluster composition, and assessed biological plausibility.
Results: Text and equations were used to explain the clustering algorithms. Cluster quality was evaluated with validity indices. Cluster separation was illustrated with t-SNE plots. Influential features were identified from SHAP plots. Cluster composition was visualized with heat maps and word clouds. Biological plausibility was assessed by expert opinion. Spectral coclustering yielded clusters with higher validity indices and greater biological plausibility than spectral biclustering.
Conclusions: When biomedical data undergo simultaneous clustering of rows and columns, a formal explanation space can improve the transparency of the methods and the interpretability of the results.
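As an illustration of the comparison described in the Methods and Results, the following is a minimal sketch of how two spectral methods might be compared on a validity index. It assumes scikit-learn's SpectralCoclustering and SpectralBiclustering as the implementations and the silhouette score as the validity index; the study's actual software, indices, and parameters are not stated here. The data are synthetic stand-ins for a subjects-by-phenotypes matrix, and the function name compare_spectral_methods and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering, SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import silhouette_score


def compare_spectral_methods(n_clusters=3, seed=0):
    """Fit both spectral methods to the same matrix and score the row clusters.

    Assumption: the silhouette score stands in for the validity indices;
    the synthetic matrix stands in for real subject-by-phenotype data.
    """
    # Synthetic matrix: 120 hypothetical subjects (rows) by 40 phenotype features (columns).
    data, _, _ = make_biclusters(
        shape=(120, 40), n_clusters=n_clusters, noise=5, random_state=seed
    )
    # Clip negative noise values so the matrix is non-negative, like count or binary data.
    data = np.clip(data, 0, None)

    models = {
        "coclustering": SpectralCoclustering(n_clusters=n_clusters, random_state=seed),
        "biclustering": SpectralBiclustering(n_clusters=n_clusters, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(data)
        row_labels = model.row_labels_  # cluster assignment for each subject (row)
        results[name] = {
            "row_labels": row_labels,
            # Silhouette ranges from -1 to 1; higher means better-separated clusters.
            "silhouette": silhouette_score(data, row_labels),
        }
    return results


if __name__ == "__main__":
    for name, result in compare_spectral_methods().items():
        print(f"{name}: silhouette = {result['silhouette']:.3f}")
```

On real data, the same row labels could be passed to a t-SNE plot or a feature-attribution method to build the other parts of the explanation space; scores on this synthetic matrix say nothing about which method performs better on clinical data.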