In general image recognition problems, discriminative information often lies in local image patches. For example, most human identity information exists in the image patches containing human faces. The same holds for medical images: the "bodypart identity" of a transversal slice, i.e., which body part the slice comes from, is often indicated by local image information; for instance, a cardiac slice and an aortic arch slice are differentiated only by the mediastinum region. In this work, we design a multi-stage deep learning framework for image classification and apply it to bodypart recognition. Specifically, the proposed framework aims to: 1) discover the local regions that are discriminative or non-informative for the image classification problem, and 2) learn an image-level classifier based on these local regions. We accomplish these two tasks in the two stages of the learning scheme, respectively. In the pre-training stage, a convolutional neural network (CNN) is learned in a multi-instance learning fashion to extract the most discriminative and the most non-informative local patches from the training slices. In the boosting stage, the pre-learned CNN is further boosted by these local patches for image classification. By exploiting discriminative local appearances, the resulting CNN becomes more accurate than one learned from global image context alone. The key hallmark of our method is that it automatically discovers the discriminative and non-informative local patches through multi-instance deep learning; thus, no manual annotation is required. Our method is validated on a synthetic dataset and a large-scale CT dataset, where it achieves better performance than state-of-the-art approaches, including the standard deep CNN.
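A minimal sketch of the multi-instance pre-training idea described above: treat each slice as a bag of local patches, score every patch with a shared CNN, and keep the patch with the highest response for the true class ("discriminative") plus the one with the flattest response ("non-informative"). All names here (PatchCNN, extract_patches, the patch size, stride, and the non-informativeness heuristic) are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchCNN(nn.Module):
    """Small CNN that classifies a single local patch."""
    def __init__(self, num_classes: int, patch: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * (patch // 4) ** 2, num_classes)

    def forward(self, x):                       # x: (N, 1, patch, patch)
        return self.head(self.features(x).flatten(1))   # raw class logits

def extract_patches(slice_2d: torch.Tensor, patch: int = 64, stride: int = 32):
    """Slide a window over one 2-D slice and return all patches as a batch."""
    p = slice_2d.unfold(0, patch, stride).unfold(1, patch, stride)
    return p.reshape(-1, 1, patch, patch)

def select_key_patches(model: PatchCNN, slice_2d: torch.Tensor, label: int):
    """Multi-instance selection: the patch with the highest true-class
    probability is taken as discriminative; as an assumed stand-in for
    'non-informative', we pick the patch with the lowest peak confidence."""
    patches = extract_patches(slice_2d)
    with torch.no_grad():
        probs = F.softmax(model(patches), dim=1)
    disc_idx = probs[:, label].argmax()
    noninf_idx = probs.max(dim=1).values.argmin()
    return patches[disc_idx], patches[noninf_idx]
```

In the boosting stage, the patches returned by a selection step like this would form the new training set for the same CNN, which is what lets the method refine itself without patch-level annotation.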
Nuclei segmentation is a fundamental task in histopathology image analysis. Typically, such segmentation tasks require significant effort to manually generate accurate pixel-wise annotations for fully supervised training. To alleviate this tedious manual effort, in this paper we propose a novel weakly supervised segmentation framework based on partial points annotation, i.e., only a small portion of the nuclei locations in each image are labeled. The framework consists of two learning stages. In the first stage, we design a semi-supervised strategy to learn a detection model from partially labeled nuclei locations. Specifically, an extended Gaussian mask is designed to train an initial model with the partially labeled data. Then, self-training with background propagation is proposed to exploit the unlabeled regions to boost nuclei detection and suppress false positives. In the second stage, a segmentation model is trained from the detected nuclei locations in a weakly supervised fashion. Two types of coarse labels with complementary information are derived from the detected points and are then utilized to train a deep neural network. A fully-connected conditional random field loss is utilized during training to further refine the model without introducing extra computational complexity during inference. The proposed method is extensively evaluated on two nuclei segmentation datasets. The experimental results demonstrate that our method can achieve competitive performance compared to the fully supervised counterpart and state-of-the-art methods while requiring significantly less annotation effort.
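A hedged sketch of the "extended Gaussian mask" idea for training a detector from partial point labels: each annotated nucleus centre gets a Gaussian peak to regress, a ring around it is supervised as foreground/background, and all remaining pixels receive zero loss weight because they may contain unlabeled nuclei. The function names, sigma, and radii below are illustrative choices, not the paper's exact values.

```python
import numpy as np

def build_gaussian_mask(shape, points, sigma=4.0, bg_radius=12):
    """Return (target, weight) for one image: target in [0, 1] to regress,
    weight in {0, 1} marking which pixels contribute to the loss."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    target = np.zeros(shape, dtype=np.float32)
    weight = np.zeros(shape, dtype=np.float32)      # 0 = ignore (unlabeled)
    for (py, px) in points:
        d2 = (ys - py) ** 2 + (xs - px) ** 2
        target = np.maximum(target, np.exp(-d2 / (2 * sigma ** 2)))
        weight[d2 <= bg_radius ** 2] = 1.0          # supervise peak + nearby bg
    return target, weight

def masked_mse(pred, target, weight):
    """Regression loss in which only the supervised pixels contribute."""
    return ((pred - target) ** 2 * weight).sum() / (weight.sum() + 1e-8)
```

The subsequent self-training step could then grow the weight mask outward from confidently predicted background, which is one plausible reading of "background propagation"; the paper's exact propagation rule is not reproduced here.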
Breast carcinoma is the most common cancer among women worldwide and comprises a heterogeneous group of subtype diseases. Whole-slide images (WSIs) can capture this cell-level heterogeneity and are routinely used for cancer diagnosis by pathologists. However, key driver genetic mutations related to targeted therapies are identified by genomic analysis, such as high-throughput molecular profiling. In this study, we develop a deep-learning model to predict genetic mutations and biological pathway activities directly from WSIs. Our study offers unique insights into the visual interactions in WSIs between a mutation and its related pathway, enabling a head-to-head comparison that reinforces our major findings. Using histopathology images from the Genomic Data Commons Database, our model can predict the point mutations of six important genes (AUC 0.68–0.85) and copy number alterations of another six genes (AUC 0.69–0.79). Additionally, the trained models can predict the activities of three out of ten canonical pathways (AUC 0.65–0.79). Next, we visualized the weight maps of tumor tiles in the WSIs to understand the decision-making process of the deep-learning models via a self-attention mechanism. We further validated our models on liver and lung cancers, which are related to metastatic breast cancer. Our results provide insights into the associations among pathological image features, molecular outcomes, and targeted therapies for breast cancer patients.
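A minimal sketch of attention-based pooling over WSI tiles, the kind of self-attention aggregation that yields the per-tile weight maps visualized in such studies. The feature dimension, the AttnPool name, and the two-layer attention scorer are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class AttnPool(nn.Module):
    """Aggregate a bag of tile embeddings into one slide-level embedding
    and expose the per-tile attention weights for visualization."""
    def __init__(self, dim=512, hidden=128, num_outputs=1):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.classifier = nn.Linear(dim, num_outputs)

    def forward(self, tiles):                         # tiles: (num_tiles, dim)
        a = torch.softmax(self.score(tiles), dim=0)   # (num_tiles, 1)
        slide = (a * tiles).sum(dim=0)                # attention-weighted mean
        return self.classifier(slide), a.squeeze(1)   # logit + tile weights

# usage: tile embeddings would come from any frozen CNN encoder (hypothetical)
feats = torch.randn(1000, 512)            # 1000 tumor tiles from one WSI
logit, weights = AttnPool()(feats)        # weights map back onto the slide
```

Because each tile keeps its spatial coordinates, the returned weights can be painted back onto the slide as a heat map, which is how a weight-map visualization of the model's decision process is typically produced.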