Digital pathology has entered a new era with the availability of whole slide scanners that create high-resolution images of full biopsy slides. Consequently, two new challenges have emerged: the uncertainty regarding the correspondence between image areas and the diagnostic labels assigned by pathologists at the slide level, and the need to identify regions that belong to multiple classes with different clinical significance. However, the generalizability of state-of-the-art algorithms, whose accuracies were reported on carefully selected regions of interest (ROIs) for binary benign-versus-cancer classification, to these multi-class learning and localization problems is currently unknown. This paper presents our potential solutions to these challenges by exploiting the viewing records of pathologists and their slide-level annotations in weakly supervised learning scenarios. First, we extract candidate ROIs from the logs of pathologists' image screenings based on different behaviors, such as zooming, panning, and fixation. Then, we model each slide with a bag of instances represented by the candidate ROIs and a set of class labels extracted from the pathology forms. Finally, we use four different multi-instance multi-label learning algorithms for both slide-level and ROI-level predictions of diagnostic categories in whole slide breast histopathology images. Slide-level evaluation using 5-class and 14-class settings showed average precision values up to 81% and 69%, respectively, under different weakly labeled learning scenarios. ROI-level predictions showed that the classifier could successfully perform multi-class localization and classification within whole slide images that were selected to include the full range of challenging diagnostic categories.
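The bag-of-instances formulation above lends itself to a simple illustration. The following is a minimal sketch assuming precomputed ROI feature vectors and a generic instance-level scorer (`instance_scorer` is a hypothetical stand-in, not one of the paper's four MIML algorithms); max-pooling over instances is one common way to lift ROI-level scores to slide-level labels.

```python
import numpy as np

def slide_to_bag(roi_features, slide_labels):
    """A slide is a bag of instance vectors (one per candidate ROI)
    paired with the set of slide-level diagnostic labels."""
    return {"instances": np.stack(roi_features), "labels": set(slide_labels)}

def predict_slide(bag, instance_scorer, threshold=0.5):
    """Generic MIML-style inference: score every ROI for every class, then
    max-pool over instances so that any strongly positive ROI can trigger
    the corresponding slide-level label. Returns both the predicted
    slide-level classes and the ROI-level score matrix."""
    roi_scores = instance_scorer(bag["instances"])  # shape: (n_rois, n_classes)
    slide_scores = roi_scores.max(axis=0)           # aggregate over the bag
    return np.where(slide_scores >= threshold)[0], roi_scores
```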
We propose a framework for learning feature representations for variable-sized regions of interest (ROIs) in breast histopathology images from patch-level convolutional network features. The proposed method involves fine-tuning a pre-trained convolutional neural network (CNN) using small fixed-sized patches sampled from the ROIs. The CNN is then used to extract a convolutional feature vector for each patch. The softmax probabilities of a patch, also obtained from the CNN, are used as weights that are separately applied to the feature vector of the patch. The final feature representation of a patch is the concatenation of these class-probability weighted convolutional feature vectors. Finally, the feature representation of the ROI is computed by average pooling of the feature representations of its associated patches. The resulting ROI representation contains local information from the feature representations of its patches while encoding cues from the class distribution of the patch classification outputs. The experiments show the discriminative power of this representation in a 4-class ROI-level classification task on breast histopathology slides, where our method achieved an accuracy of 66.8% on a data set containing 437 ROIs of different sizes.
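The weighting-and-pooling scheme can be written down compactly. Below is a minimal sketch assuming the per-patch feature vectors and softmax outputs have already been extracted from the fine-tuned CNN (function names are illustrative); with C classes and D-dimensional features, each patch maps to a (C x D)-dimensional vector, and the ROI inherits that fixed length regardless of its size.

```python
import numpy as np

def patch_representation(feature, probs):
    """Concatenate C copies of the D-dim convolutional feature vector,
    each scaled by one of the C softmax class probabilities."""
    return np.concatenate([p * feature for p in probs])

def roi_representation(patch_features, patch_probs):
    """Average-pool the class-probability weighted patch representations
    so that variable-sized ROIs map to a fixed-length vector."""
    reps = [patch_representation(f, p)
            for f, p in zip(patch_features, patch_probs)]
    return np.mean(reps, axis=0)
```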
We propose a virtual staining methodology based on Generative Adversarial Networks (GANs) to map histopathology images of breast cancer tissue from H&E stain to PHH3 and vice versa. We use the resulting synthetic images to build Convolutional Neural Networks (CNNs) for automatic detection of mitotic figures, a strong prognostic biomarker used in routine breast cancer diagnosis and grading. We propose several scenarios in which CNNs trained with synthetically generated histopathology images perform on par with, or even better than, the same baseline model trained with real images. We discuss the potential of this application to scale the number of training samples without the need for manual annotations.
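One way to picture the annotation-free scaling is as a data-side augmentation step. The sketch below assumes a GAN generator has already been trained for one translation direction (`generator` and the data loader are hypothetical placeholders, not the paper's implementation) and simply interleaves real and virtually stained tiles under the same labels.

```python
import torch

@torch.no_grad()
def virtually_stain(generator, batch):
    """Map a batch of tiles to the other stain (e.g., H&E -> PHH3),
    depending on which direction the generator was trained for."""
    generator.eval()
    return generator(batch)

def augmented_batches(real_loader, generator):
    """Yield real and synthetic tiles with the same mitosis labels,
    scaling the training set without new manual annotations."""
    for images, labels in real_loader:
        yield images, labels                               # real stain
        yield virtually_stain(generator, images), labels   # synthetic stain
```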
Digitization of full biopsy slides using whole slide imaging technology has provided new opportunities for understanding the diagnostic process of pathologists and developing more accurate computer-aided diagnosis systems. However, whole slide images also pose two new challenges to image analysis algorithms. The first is the need for simultaneous localization and classification of malignant areas in these large images, as different parts of the image may have different levels of diagnostic relevance. The second is the uncertainty regarding the correspondence between particular image areas and the diagnostic labels typically provided by pathologists at the slide level. In this paper, we exploit a data set that consists of recorded actions of pathologists while they were interpreting whole slide images of breast biopsies to find solutions to these challenges. First, we extract candidate regions of interest (ROIs) from the logs of pathologists' image screenings based on different actions corresponding to zoom events, panning motions, and fixations. Then, we model these ROIs using color and texture features. Next, we represent each slide as a bag of instances corresponding to the collection of candidate ROIs and a set of slide-level labels extracted from the forms that the pathologists filled out according to what they saw during their screenings. Finally, we build classifiers using five different multi-instance multi-label learning algorithms, and evaluate their performances under different learning and validation scenarios involving various combinations of data from three expert pathologists. Experiments that compared the slide-level predictions of the classifiers with the reference data showed average precision values up to 62% when the training and validation data came from the same individual pathologist's viewing logs, and an average precision of 64% was obtained when the candidate ROIs and the labels from all pathologists were combined for each slide.
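The log-mining step is the part most easily made concrete. The following is a minimal sketch under assumed log fields and thresholds (the paper defines its own zoom, panning, and fixation criteria): a viewport becomes a candidate ROI when it was viewed at high magnification or dwelled on long enough to count as a fixation.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    x: float      # viewport rectangle on the slide
    y: float
    w: float
    h: float
    zoom: float   # magnification level
    t: float      # timestamp in seconds

def candidate_rois(log, min_zoom=10.0, min_dwell=2.0):
    """Flag a viewport as a candidate ROI when the pathologist zoomed in
    beyond `min_zoom` or kept the same view for at least `min_dwell`
    seconds (a fixation); panning shows up as consecutive shifted views."""
    rois = []
    for prev, cur in zip(log, log[1:]):
        dwell = cur.t - prev.t
        if prev.zoom >= min_zoom or dwell >= min_dwell:
            rois.append((prev.x, prev.y, prev.w, prev.h))
    return rois
```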
To guide the choice of treatment, every new breast cancer is assessed for aggressiveness (i.e., graded) by an experienced histopathologist. Typically, this tumor grade consists of three components, one of which is the nuclear pleomorphism score (the extent of abnormalities in the overall appearance of tumor nuclei). The degree of nuclear pleomorphism is subjectively classified from 1 to 3, where a score of 1 indicates nuclei that most closely resemble those of normal breast epithelium and a score of 3 indicates the greatest abnormalities. Establishing numerical criteria for grading nuclear pleomorphism is challenging, and inter-observer agreement is poor. Therefore, we studied the use of deep learning to develop fully automated nuclear pleomorphism scoring in breast cancer. The reference standard used for training the algorithm consisted of the collective knowledge of an international panel of 10 pathologists on a curated set of regions of interest covering the entire spectrum of tumor morphology in breast cancer. To fully exploit the information provided by the pathologists, a first-of-its-kind deep regression model was trained to yield a continuous score rather than limiting the pleomorphism scoring to the standard three-tiered system. Our approach preserves the continuum of nuclear pleomorphism without necessitating a large data set with explicit annotations of tumor nuclei. Once translated to the traditional system, our approach achieves top pathologist-level performance in multiple experiments on regions of interest and whole-slide images, compared to panels of 10 and 4 pathologists, respectively.
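The continuous-scoring idea reduces to a regression head plus a translation rule. The sketch below is illustrative only (backbone choice, output scaling, and the cut-off values are assumptions, not the published model): a CNN regresses a score in the clinical [1, 3] range, which can then be thresholded back to the traditional three tiers.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class PleomorphismRegressor(nn.Module):
    """CNN regressor producing a continuous pleomorphism score in [1, 3]."""
    def __init__(self):
        super().__init__()
        self.backbone = resnet18(weights=None)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)

    def forward(self, x):
        # Squash the single output into [1, 3] to match the clinical range.
        return 1.0 + 2.0 * torch.sigmoid(self.backbone(x)).squeeze(1)

def to_three_tier(score, cut1=1.67, cut2=2.33):
    """Translate the continuous score back to the standard 1/2/3 grade
    (cut-offs here are arbitrary illustrative choices)."""
    return 1 if score < cut1 else (2 if score < cut2 else 3)
```

Training such a model with a mean-squared-error loss against panel-averaged scores would be one natural way to exploit the pathologists' collective knowledge without nucleus-level annotations.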