Background
Prostate cancer is one of the most common malignant diseases and is characterized by heterogeneity in its clinical course. To date, there are no efficient morphologic features or genomic biomarkers that can characterize the phenotypes of the cancer, especially with regard to metastasis, the most adverse outcome. Searching for effective surrogate genes in large quantities of gene expression data is key to cancer phenotyping and to understanding the molecular mechanisms underlying prostate cancer development.

Results
Using the maximum relevance minimum redundancy (mRMR) method on microarray data from normal tissues, primary tumors and metastatic tumors, we identified four genes that can optimally classify samples of different prostate cancer phases. Moreover, we constructed a molecular interaction network from existing bioinformatic resources and co-identified eight genes on the shortest paths among the mRMR-identified genes, which are potential co-acting factors in prostate cancer. Functional analyses show that molecular functions involved in cell communication, hormone-receptor mediated signaling, and transcription regulation play important roles in the development of prostate cancer.

Conclusion
We conclude that the surrogate genes we have selected compose an effective classifier of prostate cancer phases, which corresponds to a minimum characterization of cancer phenotypes at the molecular level. Together with their molecular interaction partners, it is fair to assume that these genes may have important roles in prostate cancer development; in particular, the unreported genes may bring new insights into the molecular mechanisms. Thus our results may serve as a candidate gene set for further functional studies.
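
As a rough illustration of the mRMR selection step named in the Results, the sketch below implements one common greedy variant (the "difference" form: relevance minus mean redundancy) on discretized expression values. The abstract does not say which variant or discretization was used, so both are assumptions here. The gene names, expression values and the helpers `mutual_information` and `mrmr_select` are hypothetical and exist only for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy mRMR, difference form (an assumed variant).

    At each step, pick the feature that maximizes its relevance to the
    labels minus its mean redundancy with the features already selected.
    """
    relevance = {
        name: mutual_information(values, labels)
        for name, values in features.items()
    }
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): 0 = normal, 1 = primary tumor, 2 = metastatic tumor.
labels = [0, 0, 1, 1, 2, 2]
toy_expression = {
    "gene_a": [0, 0, 1, 1, 1, 1],      # separates normal from tumor
    "gene_a_dup": [0, 0, 1, 1, 1, 1],  # fully redundant with gene_a
    "gene_b": [0, 0, 0, 0, 1, 1],      # separates metastatic from the rest
    "gene_noise": [0, 1, 0, 1, 0, 1],  # unrelated to the labels
}

selected = mrmr_select(toy_expression, labels, k=2)
print(selected)
```

In this toy setting the redundancy penalty keeps the duplicate gene out of the selected pair even though it is as relevant to the labels as the original; this is the behavior that separates mRMR from ranking genes by relevance alone.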