Objective: Complex machine learning classification algorithms applied to transcriptome data from post-mortem cerebellar tissue of bipolar patients and unaffected controls have recently been included in pipelines for patient-control classification and the identification of characteristic biomarkers. Differences in transcriptomic profile between patients and controls can provide useful information about the role of the cerebellum in the pathogenesis of bipolar disorder and mood dysregulation, as well as in normal mood regulation and physiology. User-friendly, fully automated machine learning algorithms, using data extracted from established repositories, could achieve extremely high classification scores and identify disease-related predictive biomarkers in very short time frames, even when scaled down to small datasets, thus facilitating research on mood disorders.
Method: A fully automated machine learning platform, which selects the most suitable algorithm and the relevant set of hyperparameter values, is applied to classification between patients and controls and to the production of models for biosignature selection. The transcriptome data used for the analysis were downloaded from the BioDataome database of preprocessed datasets. The dataset is derived from the parent Gene Expression Omnibus datasets GSE35974 (2013) and GSE35978, which were originally produced from the cerebellar and parietal lobe tissue of deceased bipolar patients and unaffected controls (from the Stanley Medical Research Institute Neuropathology Consortium and Array collections) using the Affymetrix Human Gene 1.0 ST Array. Patient and control groups were closely matched for age and sex.
Results: Bipolar patients were distinguished from controls on the basis of the cerebellar transcriptomic profile with an AUC of 0.929 and an Average Precision of 0.955. Patients and controls were classified into two separate groups, with no cases close to the boundary. Using 6 of the characteristic features discovered during the selection process, 99.6% classification accuracy was achieved. The three biomarkers contributing most to the predictive power of the model (92.7%) are also deregulated in temporal lobe epilepsy.
Conclusion: The cerebellar transcriptome of bipolar patients has a distinct profile and can be used for further exploration of the role of this area in health and disease.
An AUC of 93% and an Average Precision of 96% were achieved in classification between unaffected controls and patients with bipolar disorder.
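For readers less familiar with the two metrics reported above, the following is a minimal sketch of how an AUC and an Average Precision can be computed from classifier scores. It is an illustration only, not the platform's implementation: the function names `roc_auc` and `average_precision` are hypothetical, the labels and scores are made-up toy values rather than data from the study, and the Average Precision function assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values taken at the rank of each positive sample.

    Assumes no tied scores (ties would make the ranking ambiguous).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` samples
    return total / hits


# Toy example (hypothetical values): 1 = patient, 0 = control.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUC = {auc:.3f}")                 # AUC = 0.889
print(f"Average Precision = {ap:.3f}")    # Average Precision = 0.917
```

In this toy case one control (score 0.7) is ranked above one patient (score 0.6), so 8 of the 9 patient-control pairs are ordered correctly and both metrics fall below 1.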