Outcomes for cancer patients have previously been estimated by applying various machine learning techniques to large datasets such as the Surveillance, Epidemiology, and End Results (SEER) program database. For lung cancer in particular, it is not well understood which types of techniques yield more predictive information, or which data attributes should be used to obtain it. In this study, a number of supervised learning techniques are applied to the SEER database to classify lung cancer patients in terms of survival, including linear regression, Decision Trees, Gradient Boosting Machines (GBM), Support Vector Machines (SVM), and a custom ensemble. Key data attributes in applying these methods include tumor grade, tumor size, gender, age, stage, and number of primaries, with the goal of enabling comparison of predictive power between the various methods. Survival is treated as a continuous target, rather than as a classification into categories, as a first step towards improving survival prediction. The results show that the predicted values agree with actual values for low to moderate survival times, which constitute the majority of the data. The best performing technique was the custom ensemble, with a Root Mean Square Error (RMSE) value of 15.05. The most influential model within the custom ensemble was GBM, while Decision Trees may be inapplicable because they produced too few discrete outputs. The results further show that among the five individual models generated, the most accurate was GBM, with an RMSE value of 15.32. Although SVM underperformed with an RMSE value of 15.82, statistical analysis singles out SVM as the only model that generated a distinctive output. The results of the models are consistent with a classical Cox proportional hazards model used as a reference technique.
We conclude that application of these supervised learning techniques to lung cancer data in the SEER database may be of use in estimating patient survival time, with the ultimate goal of informing patient care decisions, and that the performance of these techniques with this particular dataset may be on par with that of classical methods.
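The RMSE comparison between individual models and an ensemble can be illustrated with a minimal sketch. The abstract does not say how the custom ensemble combines its members, so the weighted average below is an assumption, and the function names and all numbers are made up for illustration only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def weighted_ensemble(predictions, weights):
    """Weighted average of per-model predictions.

    Assumption: a plain weighted mean; the source does not describe
    the combination rule used by the custom ensemble.
    """
    total = sum(weights)
    n = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]

# Made-up survival values and predictions from two hypothetical models.
actual = [6.0, 12.0, 24.0, 3.0]
model_a = [8.0, 10.0, 20.0, 5.0]   # tends to err in one direction
model_b = [5.0, 15.0, 30.0, 2.0]   # tends to err in the other
combined = weighted_ensemble([model_a, model_b], [1.0, 1.0])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("ensemble", combined)]:
    print(f"{name}: RMSE = {rmse(actual, pred):.3f}")
```

In this toy case the two models err in opposite directions, so their average has a lower RMSE than either one alone. Real ensembles gain less when member errors are correlated.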
IMPORTANCE Deep learning–based methods for analyzing histological patterns in high-resolution microscopy images, such as the sliding window approach for cropped-image classification and heuristic aggregation for whole-slide inference, have shown promising results. These approaches, however, require a laborious annotation process and are fragmented.
OBJECTIVE To evaluate a novel deep learning method that uses tissue-level annotations for high-resolution histological image analysis for Barrett esophagus (BE) and esophageal adenocarcinoma detection.
DESIGN, SETTING, AND PARTICIPANTS This diagnostic study collected deidentified high-resolution histological images (N = 379) for training a new model composed of a convolutional neural network and a grid-based attention network. Histological images of patients who underwent endoscopic esophagus and gastroesophageal junction mucosal biopsy between January 1, 2016, and December 31, 2018, at Dartmouth-Hitchcock Medical Center (Lebanon, New Hampshire) were collected.
MAIN OUTCOMES AND MEASURES The model was evaluated on an independent testing set of 123 histological images with 4 classes: normal, BE-no-dysplasia, BE-with-dysplasia, and adenocarcinoma. Performance of this model was measured and compared with that of the current state-of-the-art sliding window approach using the following standard machine learning metrics: accuracy, recall, precision, and F1 score.
RESULTS Of the independent testing set of 123 histological images, 30 (24.4%) were in the BE-no-dysplasia class, 14 (11.4%) in the BE-with-dysplasia class, 21 (17.1%) in the adenocarcinoma class, and 58 (47.2%) in the normal class. Classification accuracies of the proposed model were 0.85 (95% CI, 0.81–0.90) for the BE-no-dysplasia class, 0.89 (95% CI, 0.84–0.92) for the BE-with-dysplasia class, and 0.88 (95% CI, 0.84–0.92) for the adenocarcinoma class. The proposed model achieved a mean accuracy of 0.83 (95% CI, 0.80–0.86) and marginally outperformed the sliding window approach on the same testing set. The F1 scores of the attention-based model were at least 8% higher for each class compared with the sliding window approach: 0.68 (95% CI, 0.61–0.75) vs 0.61 (95% CI, 0.53–0.68) for the normal class, 0.72 (95% CI, 0.63–0.80) vs 0.58 (95% CI, 0.45–0.69) for the BE-no-dysplasia class, 0.30 (95% CI, 0.11–0.48) vs 0.22 (95% CI, 0.11–0.33) for the BE-with-dysplasia class, and 0.67 (95% CI, 0.54–0.77) vs 0.58 (95% CI, 0.44–0.70) for the adenocarcinoma class. However, this outperformance was not statistically significant.
CONCLUSIONS AND RELEVANCE Results of this study suggest that the proposed attention-based deep neural network framework for BE and esophageal adenocarcinoma detection is important because it is based solely on tissue-level annotations, unlike existing methods that are based on regions of interest. This new model is expected to open avenues for applying deep learning to digital pathology.
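For readers less familiar with the metrics named above, the sketch below shows how overall accuracy and per-class precision, recall, and F1 score are commonly derived from a confusion matrix. The class labels come from the abstract; the counts are invented for illustration and are not the study's results, and the function names are hypothetical.

```python
def per_class_metrics(confusion, labels):
    """Precision, recall, and F1 per class.

    confusion[i][j] is the number of items whose true class is i
    and whose predicted class is j.
    """
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                   # true class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp    # predicted k, true class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

def accuracy(confusion):
    """Fraction of all items that lie on the diagonal."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

labels = ["normal", "BE-no-dysplasia", "BE-with-dysplasia", "adenocarcinoma"]
# Invented counts, NOT taken from the study.
confusion = [
    [8, 2, 0, 0],
    [1, 6, 1, 0],
    [0, 2, 2, 1],
    [0, 0, 1, 6],
]

results = per_class_metrics(confusion, labels)
print(f"accuracy = {accuracy(confusion):.3f}")
for label in labels:
    print(label, {name: round(value, 3) for name, value in results[label].items()})
```

A small class such as BE-with-dysplasia can have a low F1 score even when overall accuracy looks reasonable, which is one reason to report per-class figures alongside the mean.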
Question: Are deep neural networks trained on data from a single institution for classification of colorectal polyps on digitized histopathology slides generalizable across multiple external institutions? Findings: A new deep neural network was developed based on 326 slide images from our institution to classify the four most common polyp types on digitized histopathology slides. In addition to evaluation on an internal test set of 157 slide images, we evaluated the model on an external test set of 238 slide images from 24 institutions across 13 states in the United States. This model achieved mean accuracies of 93.5% and 87.0% on the internal and external test sets, respectively, which were comparable with the performance of local pathologists on these test sets. Meaning: Deep neural networks could provide a generalizable approach for the classification of colorectal polyps on digitized histopathology slides and, if confirmed in clinical trials, could potentially improve the efficiency, reproducibility, and accuracy of one of the most common cancer screening procedures.
The epithelial-to-mesenchymal transition (EMT) is frequently co-opted by cancer cells to enhance migratory and invasive cell traits. It is a key contributor to heterogeneity, chemoresistance, and metastasis in many carcinoma types, where the intermediate EMT state plays a critical tumor-initiating role. We isolate multiple distinct single-cell clones from the SUM149PT human breast cell line spanning the EMT spectrum having diverse migratory, tumor-initiating, and metastatic qualities, including three unique intermediates. Using a multiomics approach, we identify CBFβ as a key regulator of metastatic ability in the intermediate state. To quantify epithelial-mesenchymal heterogeneity within tumors, we develop an advanced multiplexed immunostaining approach using SUM149-derived orthotopic tumors and find that the EMT state and epithelial-mesenchymal heterogeneity are predictive of overall survival in a cohort of stage III breast cancer. Our model reveals previously unidentified insights into the complex EMT spectrum and its regulatory networks, as well as the contributions of epithelial-mesenchymal plasticity (EMP) in tumor heterogeneity in breast cancer.
Heterogeneities in the perfusion of solid tumors prevent optimal delivery of nanotherapeutics. Clinical imaging protocols to obtain patient-specific data have proven difficult to implement. It is challenging to determine which perfusion features hold greater prognostic value and to relate measurements to vessel structure and function. With the advent of systemically administered nanotherapeutics, whose delivery is dependent on overcoming diffusive and convective barriers to transport, such knowledge is increasingly important. We describe a framework for the automated evaluation of vascular perfusion curves measured at the single vessel level. Primary tumor fragments, collected from triple-negative breast cancer patients and grown as xenografts in mice, were injected with fluorescence contrast and monitored using intravital microscopy. The time to arterial peak and venous delay, two features whose probability distributions were measured directly from time-series curves, were analyzed using a Fuzzy C-means (FCM) supervised classifier in order to rank individual tumors according to their perfusion characteristics. The resulting rankings correlated inversely with experimental nanoparticle accumulation measurements, enabling modeling of nanotherapeutics delivery without requiring any underlying assumptions about tissue structure or function, or heterogeneities contained within. With additional calibration, these methodologies may enable the study of nanotherapeutics delivery strategies in a variety of tumor models.
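As background on the fuzzy C-means step, the sketch below implements the textbook FCM update rules on two-feature points. It is not the authors' pipeline: the abstract does not describe how the classifier was supervised or how memberships were turned into rankings. The feature values, initial centers, and function names are hypothetical, and each vessel is reduced to a single (time to arterial peak, venous delay) pair rather than a probability distribution.

```python
import math

def update_memberships(points, centers, m, eps=1e-9):
    """Each row holds one point's degree of membership in every cluster."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), eps) for c in centers]
        rows.append([
            1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
            for d_k in dists
        ])
    return rows

def update_centers(points, memberships, m):
    """Each center is the mean of all points weighted by membership ** m."""
    dims = len(points[0])
    centers = []
    for k in range(len(memberships[0])):
        weights = [row[k] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)
        ))
    return centers

def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(iterations):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return centers, update_memberships(points, centers, m)

# Hypothetical (time to arterial peak, venous delay) pairs, arbitrary units.
points = [(1.0, 0.5), (1.2, 0.6), (0.9, 0.4), (4.0, 2.5), (4.2, 2.7), (3.8, 2.4)]
centers, memberships = fuzzy_c_means(points, [(0.0, 0.0), (5.0, 3.0)])

for point, row in zip(points, memberships):
    print(point, [round(u, 3) for u in row])
```

Unlike hard clustering, every point receives a graded membership in each cluster, and the memberships for a point sum to 1. A graded score of this kind could in principle be used for ranking, though the source does not specify the authors' scheme.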