High levels of microsatellite instability (MSI-H) occur in about 15% of sporadic colorectal cancers (CRCs) and are an important predictive marker for response to immune checkpoint inhibitors. To test the feasibility of a deep learning (DL)-based classifier as a screening tool for MSI status, we built a fully automated DL-based MSI classifier using pathology whole-slide images (WSIs) of CRCs. On small image patches of The Cancer Genome Atlas (TCGA) CRC WSI dataset, tissue/non-tissue, normal/tumor, and MSS/MSI-H classifiers were applied sequentially for the fully automated prediction of MSI status. The classifiers were also tested on an independent cohort. Furthermore, to test how expanding the training data affects the performance of the DL-based classifier, an additional classifier trained on both the TCGA and external datasets was tested. For the classifier trained on both datasets, the areas under the receiver operating characteristic curves were 0.892 and 0.972 for the TCGA and external datasets, respectively. The performance of the DL-based classifier was much better than that of previously reported histomorphology-based methods. We speculated that about 40% of CRC slides could be screened for MSI status by the DL-based classifier without molecular testing. These results demonstrate that the DL-based method has potential as a screening tool to discriminate molecular alterations in tissue slides.
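The sequential use of the three patch classifiers described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the dictionary patch representation, the function name `predict_slide_msi`, the toy lambda classifiers, and the choice to average patch probabilities into a slide-level score are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Optional, Sequence

# Hypothetical patch representation; a real pipeline would pass image arrays.
Patch = dict


def predict_slide_msi(
    patches: Sequence[Patch],
    is_tissue: Callable[[Patch], bool],
    is_tumor: Callable[[Patch], bool],
    msi_probability: Callable[[Patch], float],
) -> Optional[float]:
    """Apply three classifiers in sequence and return a slide-level MSI-H score.

    Averaging the patch probabilities is an assumption of this sketch.
    Returns None when no tumor patch survives the first two filters.
    """
    # Step 1: discard background (non-tissue) patches.
    tissue = [p for p in patches if is_tissue(p)]
    # Step 2: keep only patches classified as tumor.
    tumor = [p for p in tissue if is_tumor(p)]
    if not tumor:
        return None
    # Step 3: score the remaining patches and aggregate per slide.
    return mean(msi_probability(p) for p in tumor)


# Toy stand-ins for trained networks (assumed values, not real data).
patches = [
    {"tissue": False, "tumor": False, "msi": 0.9},
    {"tissue": True, "tumor": False, "msi": 0.8},
    {"tissue": True, "tumor": True, "msi": 0.7},
    {"tissue": True, "tumor": True, "msi": 0.3},
]
score = predict_slide_msi(
    patches,
    is_tissue=lambda p: p["tissue"],
    is_tumor=lambda p: p["tumor"],
    msi_probability=lambda p: p["msi"],
)
print(score)
```

Only the two patches that pass both filters contribute to the score, so background and normal tissue cannot dilute the MSI prediction.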
BACKGROUND Identifying genetic mutations in cancer patients has become increasingly important because distinctive mutational patterns can be very informative in determining the optimal therapeutic strategy. Recent studies have shown that deep learning-based molecular cancer subtyping can be performed directly from standard hematoxylin and eosin (H&E) sections in diverse tumors, including colorectal cancers (CRCs). Since H&E-stained tissue slides are ubiquitously available, mutation prediction from pathology images of cancers can be a time- and cost-effective complementary method for personalized treatment. AIM To predict frequently occurring actionable mutations from H&E-stained CRC whole-slide images (WSIs) with deep learning-based classifiers. METHODS A total of 629 CRC patients from The Cancer Genome Atlas (TCGA-COAD and TCGA-READ) and 142 CRC patients from Seoul St. Mary Hospital (SMH) were included. Based on the mutation frequency in the TCGA and SMH datasets, we chose the APC, KRAS, PIK3CA, SMAD4, and TP53 genes for the study. The classifiers were trained with 360 × 360 pixel patches of tissue images. The receiver operating characteristic (ROC) curves and areas under the curve (AUCs) were presented for all the classifiers. RESULTS The AUCs for the ROC curves ranged from 0.693 to 0.809 for the TCGA frozen WSIs and from 0.645 to 0.783 for the TCGA formalin-fixed paraffin-embedded WSIs. The prediction performance can be enhanced by expanding the datasets: when the classifiers were trained with both TCGA and SMH data, the prediction performance improved. CONCLUSION APC, KRAS, PIK3CA, SMAD4, and TP53 mutations can be predicted from H&E pathology images using deep learning-based classifiers, demonstrating the potential of deep learning-based mutation prediction in CRC tissue slides.
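For readers unfamiliar with the AUC values reported above, the following sketch shows one standard way to compute the area under a ROC curve: as the fraction of (mutated, wild-type) pairs in which the mutated case receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores are invented for illustration; they are not data from the study, and the abstract does not state how the authors computed their AUCs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for the positive class (e.g. mutated), 0 for the negative class.
    scores: classifier outputs, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Invented example: six slides with hypothetical mutation scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the 0.645 to 0.809 range quoted in the abstract.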
The manual review of an electroencephalogram (EEG) for seizure detection is a laborious and error-prone process. Thus, automated seizure detection based on machine learning has been studied for decades. Recently, deep learning has been adopted to avoid manual feature extraction and selection. In the present study, we systematically compared the performance of different combinations of input modalities and network structures on a fixed window size and dataset to identify the optimal combination. The raw time-series EEG, the periodogram of the EEG, 2D images of short-time Fourier transform results, and 2D images of raw EEG waveforms were obtained from 5-s segments of intracranial EEGs recorded from a mouse model of epilepsy. A fully connected neural network (FCNN), a recurrent neural network (RNN), and a convolutional neural network (CNN) were implemented to classify the various inputs. The classification results for the test dataset showed that the CNN performed better than the FCNN and RNN, with the area under the curve (AUC) for the receiver operating characteristic curves ranging from 0.983 to 0.984, from 0.985 to 0.989, and from 0.989 to 0.993 for the FCNN, RNN, and CNN, respectively. As for input modalities, 2D images of raw EEG waveforms yielded the best result, with an AUC of 0.993. Thus, a CNN can be the most suitable network structure for automated seizure detection when applied to images of raw EEG waveforms, since a CNN can effectively learn a general, spatially invariant representation of seizure patterns in 2D representations of raw EEG.
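One of the input modalities named above, the periodogram of a 5-s segment, can be sketched as below. This is a plain discrete Fourier transform written for clarity rather than speed. The 100 Hz sampling rate, the synthetic 10 Hz sine wave, the mean removal, and the scaling by `fs * n` are assumptions for the example; the abstract does not give the recording parameters or the exact preprocessing.

```python
import cmath
import math


def periodogram(segment, fs):
    """Return (frequencies in Hz, power) for a real-valued signal segment.

    Uses a direct DFT over the non-negative frequencies. The mean is removed
    first so the zero-frequency bin does not dominate (an assumed choice).
    """
    n = len(segment)
    mean_val = sum(segment) / n
    x = [v - mean_val for v in segment]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


# Synthetic 5-s segment: a 10 Hz sine sampled at an assumed 100 Hz.
fs = 100
duration_s = 5
segment = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs * duration_s)]

freqs, power = periodogram(segment, fs)
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

With a 5-s window the frequency resolution is 1/5 = 0.2 Hz regardless of the sampling rate, which is one reason the window length was held fixed when comparing input modalities.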