Breast cancer comprises a group of distinct subtypes that, despite having similar histologic appearances, have very different metastatic potentials. Being able to identify the biological driving force, even for a subset of patients, is crucially important given the large population of women diagnosed with breast cancer. Here, we show that within a subset of patients characterized by relatively high estrogen receptor expression for their age, the occurrence of metastases is strongly predicted by a homogeneous gene expression pattern consisting almost entirely of cell cycle genes (5-year odds ratio of metastasis, 24.0; 95% confidence interval, 6.0-95.5). Overexpression of this set of genes is clearly associated with an extremely poor outcome: the 10-year metastasis-free probability is only 24% for the poor-prognosis group, compared with 85% for the good-prognosis group. In contrast, this gene expression pattern is much less correlated with outcome in other patient subpopulations. The methods described here also illustrate the value of combining clinical variables, biological insight, and machine learning to dissect biological complexity. This work may contribute a crucial step toward the rational design of personalized treatment. (Cancer Res 2005; 65(10): 4059-66)