3140

Background: Histological pattern signatures obtained from diagnostic H&E images have recently been found to predict mutation, biomarker status, or outcome. We report a novel deep-learning-based framework designed to identify and extract predictive histological signatures. We applied this framework in three experiments, predicting the microsatellite status (MSS) of colorectal cancer (CRC), breast cancer (BC) micrometastasis in lymph nodes (LN), and pathologic complete response (pCR) in BC diagnostic biopsies.

Methods: Our deep-learning-based algorithm was trained on histology images at 20X magnification. A binary classifier was trained for each of the three cohorts. We used 75% of the images for training and tested the algorithm on the remaining 25%. Cohort details are as follows. MSS for CRC: H&E-stained tissue images from 94 patients in the Roche internal CRC80 dataset (MSS n = 24; MSI n = 70). BC LN: H&E-stained tissue images from 270 patients in the CAMELYON16 dataset (LN(+) n = 110; LN(-) n = 160). pCR for BC: H&E-stained tissue images from 225 patients in the Tryphaena study (BO22280), a neoadjuvant trastuzumab/pertuzumab chemotherapy combination trial (pCR n = 111; non-pCR n = 114).

Results: We assess algorithm performance on each cohort by area under the curve (AUC). Prediction of MSS in the CRC80 cohort yielded an AUC of 0.9. Prediction of LN invasion on the CAMELYON16 dataset yielded an AUC of 0.85. Prediction of pCR on the Tryphaena cohort yielded an AUC of 0.8.

Conclusions: We present a new approach to generating predictive signatures based on conventional diagnostic H&E images and a novel machine learning framework. The CRC80 and CAMELYON16 cohorts served as confidence-building experiments, with predictive features that are well known to clinicians and were visually confirmed. The predictive algorithm for pCR in the Tryphaena cohort yielded both a response prediction and the fields of view (FOVs) with high predictive value. These included tissue patterns that have not until now been considered to influence the prediction of pCR.
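The evaluation protocol described in the Methods (a 75%/25% split, scored by AUC) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `split_train_test` and `roc_auc`, the fixed random seed, and the toy labels and scores are all assumptions introduced here. The AUC is computed from its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle items and split them into (train, test) lists.

    Hypothetical helper: the abstract only states a 75%/25% split,
    not how the split was drawn.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage with made-up image ids and scores (not data from the study).
train_ids, test_ids = split_train_test(range(100))
print(len(train_ids), len(test_ids))                       # 75 25
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))        # 0.75
```

The pairwise loop is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a sort-based rank formulation would be the usual choice for larger sets.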
Recent advances in theoretical Deep Learning have introduced geometric properties that emerge during training past the Interpolation Threshold, where the training error reaches zero. We investigate the phenomenon termed Neural Collapse in the intermediate layers of the networks, and emphasize the inner workings of Nearest Class-Center Mismatch inside the deepnet. We further show that these processes occur in both vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometrical features in intermediate layers and improves both training metrics and generalization.
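To make the Nearest Class-Center Mismatch idea concrete, the sketch below measures how often a network's predicted label disagrees with the label of the closest class mean in some layer's feature space. The abstract does not define the quantity, so this reading is an assumption, as are the function names (`class_centers`, `nearest_center`, `ncc_mismatch`), the Euclidean distance, and the toy features.

```python
def class_centers(features, labels):
    """Mean feature vector per class, returned as {label: center}."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: [v / counts[y] for v in sums[y]] for y in sums}


def nearest_center(x, centers):
    """Label of the class center closest to x in squared Euclidean distance."""
    return min(
        centers,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centers[y])),
    )


def ncc_mismatch(features, labels, predictions):
    """Fraction of samples where the nearest-class-center label differs
    from the network's own prediction (assumed definition)."""
    centers = class_centers(features, labels)
    disagreements = sum(
        1 for x, p in zip(features, predictions)
        if nearest_center(x, centers) != p
    )
    return disagreements / len(features)


# Toy layer features: two well-separated classes (illustrative values only).
feats = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [5.0, 4.0]]
labels = [0, 0, 1, 1]
print(ncc_mismatch(feats, labels, [0, 0, 1, 1]))  # 0.0
print(ncc_mismatch(feats, labels, [0, 1, 1, 1]))  # 0.25
```

Evaluating this per layer, on features extracted from each intermediate layer in turn, would give a depth profile of the mismatch under these assumptions.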
Multiple Instance Learning is a form of weakly supervised learning in which the data is arranged in sets of instances called bags, with one label assigned per bag. The bag-level class prediction is derived from the multiple instances by applying a permutation-invariant pooling operator to instance predictions or embeddings. We present a novel pooling operator called Certainty Pooling, which incorporates the model certainty into bag predictions, resulting in a more robust and explainable model. We compare our proposed method with other pooling operators in controlled experiments with low-evidence-ratio bags based on MNIST, as well as on a real-life histopathology dataset, Camelyon16. Our method outperforms other methods in both bag-level and instance-level prediction, especially when only small training sets are available. We discuss the rationale behind our approach and the reasons for its superiority for these types of datasets.
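The role of a permutation-invariant pooling operator can be sketched with two standard choices, max and mean pooling over instance probabilities. The third function is only an illustrative guess at how certainty might enter the pooling step: the abstract does not specify the Certainty Pooling operator, so `certainty_weighted_max_pool`, its per-instance certainty inputs, and the toy numbers are assumptions, not the authors' method.

```python
def max_pool(instance_probs):
    """Bag score = highest instance probability."""
    return max(instance_probs)


def mean_pool(instance_probs):
    """Bag score = average instance probability."""
    return sum(instance_probs) / len(instance_probs)


def certainty_weighted_max_pool(instance_probs, instance_certainties):
    """Hypothetical variant: choose the instance with the largest
    certainty-weighted probability and return its raw probability.

    Ties on the weighted score are broken by the raw probability so the
    result does not depend on instance order.
    """
    best = max(
        range(len(instance_probs)),
        key=lambda i: (
            instance_probs[i] * instance_certainties[i],
            instance_probs[i],
        ),
    )
    return instance_probs[best]


# Toy bag: the second instance scores high, but the model is unsure about it.
probs = [0.1, 0.9, 0.6]
certs = [1.0, 0.2, 0.9]
print(max_pool(probs))                            # 0.9
print(certainty_weighted_max_pool(probs, certs))  # 0.6
```

In this toy bag, plain max pooling follows the uncertain high-scoring instance, while the certainty-weighted variant falls back to a lower but more confident one. That is the kind of behavior one would hope for in low-evidence-ratio bags, though the sketch makes no claim about the published results.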