BACKGROUND
Identifying genetic mutations in cancer patients have been increasingly important because distinctive mutational patterns can be very informative to determine the optimal therapeutic strategy. Recent studies have shown that deep learning-based molecular cancer subtyping can be performed directly from the standard hematoxylin and eosin (H&E) sections in diverse tumors including colorectal cancers (CRCs). Since H&E-stained tissue slides are ubiquitously available, mutation prediction with the pathology images from cancers can be a time- and cost-effective complementary method for personalized treatment.
AIM
To predict the frequently occurring actionable mutations from the H&E-stained CRC whole-slide images (WSIs) with deep learning-based classifiers.
METHODS
A total of 629 CRC patients from The Cancer Genome Atlas (TCGA-COAD and TCGA-READ) and 142 CRC patients from Seoul St. Mary Hospital (SMH) were included. Based on the mutation frequency in TCGA and SMH datasets, we chose
APC
,
KRAS
,
PIK3CA
,
SMAD4
, and
TP53
genes for the study. The classifiers were trained with 360 × 360 pixel patches of tissue images. The receiver operating characteristic (ROC) curves and area under the curves (AUCs) for all the classifiers were presented.
RESULTS
The AUCs for ROC curves ranged from 0.693 to 0.809 for the TCGA frozen WSIs and from 0.645 to 0.783 for the TCGA formalin-fixed paraffin-embedded WSIs. The prediction performance can be enhanced with the expansion of datasets. When the classifiers were trained with both TCGA and SMH data, the prediction performance was improved.
CONCLUSION
APC
,
KRAS
,
PIK3CA
,
SMAD4
, and
TP53
mutations can be predicted from H&E pathology images using deep learning-based classifiers, demonstrating the potential for deep learning-based mutation prediction in the CRC tissue slides.