Accurate distant metastasis (DM) prediction is critical for risk stratification and effective treatment decisions in breast cancer (BC). Prognostic markers and models derived from tissue marker studies continue to emerge, most of them using conventional statistical approaches to analyse the association of complex, multidimensional data with DM or poor prognosis. However, few have accumulated sufficient evidence for clinical application. This study aimed to build a DM risk assessment algorithm for BC patients.

A well-characterised series of early invasive primary operable BC cases (n=1902), with immunohistochemical (IHC) expression data for a panel of biomarkers (n=31), formed the material of this study. A decision tree algorithm was computed using WEKA software, utilising quantitative biomarker expression and the absence or presence of distant metastases.

Fifteen biomarkers were significantly associated with DM, and six temporal subgroups were characterised by time to development of DM, ranging from < 1 year to > 15 years of follow-up. Of these 15 biomarkers, 10 showed a significant expression pattern: Ki67LI, HER2, p53, N-cadherin, P-cadherin, PIK3CA and TOMM34 showed significantly higher expression with earlier development of DM, whereas higher expression of ER, PR and BCL2 was associated with delayed occurrence of DM. The DM prediction algorithm was built using cases informative for the 15 significant markers. Four risk groups of patients were characterised. Three markers (p53, HER2 and BCL2) predicted the probability of DM, based on software-generated cut-offs, with a positive predictive value of 81.1% and a negative predictive value of 77.3%.

This algorithm reiterates the reported prognostic value of these three markers and underscores their central biological role in BC progression. Further independent validation of this pruned panel of biomarkers is therefore warranted.
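
For readers less familiar with the two performance figures quoted for the algorithm, the following is a minimal sketch of how positive and negative predictive values are computed from observed and predicted DM status. It is not the study's code: the function name `predictive_values` and the toy labels are hypothetical and serve only to illustrate the standard definitions, PPV = TP / (TP + FP) and NPV = TN / (TN + FN).

```python
def predictive_values(y_true, y_pred):
    """Return (PPV, NPV) for binary labels, where 1 = DM and 0 = no DM."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against empty denominators (no positive or no negative calls).
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical toy labels for illustration only; NOT data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # observed DM status
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]  # status predicted by some classifier

ppv, npv = predictive_values(y_true, y_pred)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Both values depend on the prevalence of DM in the cohort, so figures obtained in one series may not carry over unchanged to a population with a different event rate.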