Deep learning methods for digital pathology analysis are an effective way to address multiple clinical questions, from diagnosis to prediction of treatment outcomes. These methods have also been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We show that HE2RNA, a model based on the integration of multiple data modes, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without expert annotation. Through its interpretable design, HE2RNA provides virtual spatialization of gene expression, as validated by CD3- and CD20-staining on an independent dataset. The transcriptomic representation learned by HE2RNA can also be transferred to other datasets, even small ones, to increase prediction performance for specific molecular phenotypes. We illustrate the use of this approach for clinical diagnosis purposes such as the identification of tumors with microsatellite instability.
BACKGROUND AND AIMS: Standardized and robust risk-stratification systems for patients with hepatocellular carcinoma (HCC) are required to improve therapeutic strategies and investigate the benefits of adjuvant systemic therapies after curative resection/ablation. APPROACH AND RESULTS: In this study, we used two deep-learning algorithms based on whole-slide digitized histological slides (whole-slide imaging; WSI) to build models for predicting survival of patients with HCC treated by surgical resection. Two independent series were investigated: a discovery set (Henri Mondor Hospital, n = 194) used to develop our algorithms and an independent validation set (The Cancer Genome Atlas [TCGA], n = 328). WSIs were first divided into small squares ("tiles"), and features were extracted with a pretrained convolutional neural network (preprocessing step). The first deep-learning-based algorithm ("SCHMOWDER") uses an attention mechanism on tumoral areas annotated by a pathologist whereas the second ("CHOWDER") does not require human expertise. In the discovery set, c-indices for survival prediction of SCHMOWDER and CHOWDER reached 0.78 and 0.75, respectively. Both models outperformed a composite score incorporating all baseline variables associated with survival. Prognostic value of the models was further validated in the TCGA data set, and, as observed in the discovery series, both models had a higher discriminatory power than a score combining all baseline variables associated with survival. Pathological review showed that the tumoral areas most predictive of poor survival were characterized by vascular spaces, the macrotrabecular architectural pattern, and a lack of immune infiltration. CONCLUSIONS: This study shows that artificial intelligence can help refine the prediction of HCC prognosis.
It highlights the importance of pathologist/machine interactions for the construction of deep-learning algorithms that benefit from expert knowledge and allow a biological understanding of their output.
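
The c-indices reported above (0.78 and 0.75) measure how well a model's risk scores rank patients by survival time. The abstract does not say how the authors computed them, so the following is only a minimal sketch of the standard Harrell concordance index, assuming right-censored data. The function name `concordance_index` and its arguments are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Simplified Harrell c-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score per patient (higher = worse prognosis)

    Hypothetical helper for illustration; not the authors' code.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are ignored in this sketch
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored: pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk assigned to the earlier death
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.78 means that in roughly 78% of comparable patient pairs the model assigned the higher risk to the patient who died earlier.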