Background:Deep learning (DL) is a representation learning approach ideally suited for image analysis challenges in digital pathology (DP). The variety of image analysis tasks in the context of DP includes detection and counting (e.g., mitotic events), segmentation (e.g., nuclei), and tissue classification (e.g., cancerous vs. non-cancerous). Unfortunately, issues with slide preparation, variations in staining and scanning across sites, and vendor platforms, as well as biological variance, such as the presentation of different grades of disease, make these image analysis tasks particularly challenging. Traditional approaches, wherein domain-specific cues are manually identified and developed into task-specific “handcrafted” features, can require extensive tuning to accommodate these variances. However, DL takes a more domain agnostic approach combining both feature discovery and implementation to maximally discriminate between the classes of interest. While DL approaches have performed well in a few DP related image analysis tasks, such as detection and tissue classification, the currently available open source tools and tutorials do not provide guidance on challenges such as (a) selecting appropriate magnification, (b) managing errors in annotations in the training (or learning) dataset, and (c) identifying a suitable training set containing information rich exemplars. These foundational concepts, which are needed to successfully translate the DL paradigm to DP tasks, are non-trivial for (i) DL experts with minimal digital histology experience, and (ii) DP and image processing experts with minimal DL experience, to derive on their own, thus meriting a dedicated tutorial.Aims:This paper investigates these concepts through seven unique DP tasks as use cases to elucidate techniques needed to produce comparable, and in many cases, superior to results from the state-of-the-art hand-crafted feature-based classification approaches.Results:Specifically, in this tutorial on DL for DP image analysis, we show how an open source framework (Caffe), with a singular network architecture, can be used to address: (a) nuclei segmentation (F-score of 0.83 across 12,000 nuclei), (b) epithelium segmentation (F-score of 0.84 across 1735 regions), (c) tubule segmentation (F-score of 0.83 from 795 tubules), (d) lymphocyte detection (F-score of 0.90 across 3064 lymphocytes), (e) mitosis detection (F-score of 0.53 across 550 mitotic events), (f) invasive ductal carcinoma detection (F-score of 0.7648 on 50 k testing patches), and (g) lymphoma classification (classification accuracy of 0.97 across 374 images).Conclusion:This paper represents the largest comprehensive study of DL approaches in DP to date, with over 1200 DP images used during evaluation. The supplemental online material that accompanies this paper consists of step-by-step instructions for the usage of the supplied source code, trained models, and input data.
PURPOSE Digital pathology (DP), referring to the digitization of tissue slides, is beginning to change the landscape of clinical diagnostic workflows and has engendered active research within the area of computational pathology. One of the challenges in DP is the presence of artefacts and batch effects, unintentionally introduced during both routine slide preparation (eg, staining, tissue folding) and digitization (eg, blurriness, variations in contrast and hue). Manual review of glass and digital slides is laborious, qualitative, and subject to intra- and inter-reader variability. Therefore, there is a critical need for a reproducible automated approach of precisely localizing artefacts to identify slides that need to be reproduced or regions that should be avoided during computational analysis. METHODS Here we present HistoQC, a tool for rapidly performing quality control to not only identify and delineate artefacts but also discover cohort-level outliers (eg, slides stained darker or lighter than others in the cohort). This open-source tool employs a combination of image metrics (eg, color histograms, brightness, contrast), features (eg, edge detectors), and supervised classifiers (eg, pen detection) to identify artefact-free regions on digitized slides. These regions and metrics are presented to the user via an interactive graphical user interface, facilitating artefact detection through real-time visualization and filtering. These same metrics afford users the opportunity to explicitly define acceptable tolerances for their workflows. RESULTS The output of HistoQC on 450 slides from The Cancer Genome Atlas was reviewed by two pathologists and found to be suitable for computational analysis more than 95% of the time. CONCLUSION These results suggest that HistoQC could provide an automated, quantifiable, quality control process for identifying artefacts and measuring slide quality, in turn helping to improve both the repeatability and robustness of DP workflows.
Digital histopathology slides have many sources of variance, and while pathologists typically do not struggle with them, computer aided diagnostic algorithms can perform erratically. This manuscript presents Stain Normalization using Sparse AutoEncoders (StaNoSA) for use in standardizing the color distributions of a test image to that of a single template image. We show how sparse autoencoders can be leveraged to partition images into tissue sub-types, so that color standardization for each can be performed independently. StaNoSA was validated on three experiments and compared against five other color standardization approaches and shown to have either comparable or superior results.
Identification of patients with early stage non-small cell lung cancer (NSCLC) with high risk of recurrence could help identify patients who would receive additional benefit from adjuvant therapy. In this work, we present a computational histomorphometric image classifier using nuclear orientation, texture, shape, and tumor architecture to predict disease recurrence in early stage NSCLC from digitized H&E tissue microarray (TMA) slides. Using a retrospective cohort of early stage NSCLC patients (Cohort #1, n = 70), we constructed a supervised classification model involving the most predictive features associated with disease recurrence. This model was then validated on two independent sets of early stage NSCLC patients, Cohort #2 (n = 119) and Cohort #3 (n = 116). The model yielded an accuracy of 81% for prediction of recurrence in the training Cohort #1, 82% and 75% in the validation Cohorts #2 and #3 respectively. A multivariable Cox proportional hazard model of Cohort #2, incorporating gender and traditional prognostic variables such as nodal status and stage indicated that the computer extracted histomorphometric score was an independent prognostic factor (hazard ratio = 20.81, 95% CI: 6.42–67.52, P < 0.001).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.