We present a method for the supervised, automatic, and reliable classification of healthy controls, patients with bipolar disorder, and patients with schizophrenia from brain imaging data. The method uses four supervised classification learning machines trained with a stochastic gradient learning rule that minimizes the Kullback-Leibler divergence, together with an optimal model complexity search through posterior probability estimation. Because functional magnetic resonance imaging (fMRI) data are high-dimensional, classification is preceded by a two-step dimension reduction stage: first, a univariate one-sample t-test (mean-difference T score) approach is used to restrict the input to significantly discriminative, functionally activated voxels; then, singular value decomposition (SVD) further reduces the dimension of the input patterns to a number comparable to the limited number of subjects available for each of the three classes. Experimental results on fMRI data include receiver operating characteristic (ROC) curves for the 3-way classifier, with area under the curve (AUC) values of around 0.82, 0.89, and 0.90 for the binary problems of healthy control versus non-healthy, bipolar disorder versus non-bipolar, and schizophrenia versus non-schizophrenia, respectively. The average 3-way correct classification rate on the test set is in the range of 70-72%, close to the theoretical upper bound given by the Bayes-optimal correct classification rate, estimated at about 80% from the performance of the 1-nearest-neighbor classifier on the same data.
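The two-step dimension reduction stage can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`t_scores`, `reduce_dimension`), the choice to rank voxels by the absolute one-sample T score against a zero mean, the mean-centering before the SVD, and the synthetic data sizes are all assumptions made for illustration only.

```python
import numpy as np

def t_scores(X):
    """One-sample T score per voxel (column), testing against a zero mean.

    Assumption: each row of X is one subject's activation map.
    """
    n = X.shape[0]
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation
    return mean / (std / np.sqrt(n))

def reduce_dimension(X_train, X_test, n_voxels, n_components):
    """Step 1: keep the n_voxels columns with the largest |T|.
    Step 2: project onto the leading right singular vectors (SVD).

    Both steps are fitted on the training set only and then applied
    unchanged to the test set.
    """
    t = t_scores(X_train)
    keep = np.argsort(-np.abs(t))[:n_voxels]

    A_train = X_train[:, keep]
    A_test = X_test[:, keep]

    # Centering is an illustrative choice, not stated in the abstract.
    center = A_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(A_train - center, full_matrices=False)
    basis = Vt[:n_components].T  # shape: (n_voxels, n_components)

    return (A_train - center) @ basis, (A_test - center) @ basis, keep

# Synthetic stand-in for fMRI data (hypothetical sizes).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 5000))
X_train[:, :50] += 1.0  # plant 50 "activated" voxels
X_test = rng.normal(size=(10, 5000))

Z_train, Z_test, keep = reduce_dimension(
    X_train, X_test, n_voxels=200, n_components=10
)
print(Z_train.shape, Z_test.shape)
```

Fitting the voxel selection and the SVD basis on the training subjects alone keeps the test set out of the feature construction, which matters when the number of subjects is small relative to the number of voxels.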