Purpose
Recent advances in machine learning have enabled better understanding of large and complex visual data. Here, we aim to investigate patient outcome prediction with a machine learning method using only an image of a tumour sample as input.
Methods
Utilising tissue microarray (TMA) samples obtained from the primary tumour of patients (N = 1299) within a nationwide breast cancer series with long-term follow-up, we train and validate a machine learning method for patient outcome prediction. The prediction is performed by classifying samples into low or high digital risk score (DRS) groups. The outcome classifier is trained using sample images of 868 patients and evaluated and compared with human expert classification in a test set of 431 patients.
Results
In univariate survival analysis, the DRS classification resulted in a hazard ratio of 2.10 (95% CI 1.33–3.32, p = 0.001) for breast cancer-specific survival. The DRS classification remained as an independent predictor of breast cancer-specific survival in a multivariate Cox model with a hazard ratio of 2.04 (95% CI 1.20–3.44, p = 0.007). The accuracy (C-index) of the DRS grouping was 0.60 (95% CI 0.55–0.65), as compared to 0.58 (95% CI 0.53–0.63) for human expert predictions based on the same TMA samples.
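To make the C-index figures concrete, the following is a minimal sketch of Harrell's concordance index for a binary risk grouping such as low/high DRS. It is an illustration under stated assumptions, not the study's implementation: the function name `concordance_index`, the convention that tied risk scores count as half-concordant, the skipping of tied survival times, and the toy data are all assumptions introduced here.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- predicted risk; here 1 = high-risk group, 0 = low-risk group

    A pair is comparable when the patient with the shorter time had an
    observed event. Tied risk scores contribute 0.5 (an assumed convention).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied survival times are skipped in this simplified sketch
        # 'a' is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): five patients, two risk groups.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [1, 0, 1, 0, 0]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

Under this tie convention, a two-group predictor produces many tied pairs, each contributing 0.5, which tends to pull the C-index towards 0.5 compared with a continuous risk score of similar quality.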
Conclusions
Our findings demonstrate the feasibility of learning prognostic signals in tumour tissue images without domain knowledge. Although further validation is needed, our study suggests that machine learning algorithms can extract prognostically relevant information from tumour histology, complementing the currently used prognostic factors in breast cancer.
Electronic supplementary material
The online version of this article (10.1007/s10549-019-05281-1) contains supplementary material, which is available to authorized users.