We introduce a novel, large-scale dataset for microscopy cell annotations. The dataset includes 32 whole slide images (WSI) of canine cutaneous mast cell tumors, selected to include both low-grade and high-grade cases. The slides have been completely annotated for mitotic figures and we provide secondary annotations for neoplastic mast cells, inflammatory granulocytes, and mitotic figure look-alikes. In addition to a blinded two-expert manual annotation with consensus, we provide an algorithm-aided dataset, where potentially missed mitotic figures were detected by a deep neural network and subsequently assessed by two human experts. We included 262,481 annotations in total, out of which 44,880 represent mitotic figures. For algorithmic validation, we used a customized RetinaNet approach, followed by a cell classification network. We find F1-scores of 0.786 and 0.820 for the manually labelled and the algorithm-aided dataset, respectively. The dataset provides, for the first time, WSIs completely annotated for mitotic figures and thus enables assessment of mitosis detection algorithms on complete WSIs as well as region of interest detection algorithms.
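The F1-scores reported above summarize how well detected mitotic figures agree with the annotated ones. The following is a minimal sketch of how such a score could be computed for point detections; the greedy nearest-neighbor matching and the `radius` of 25 pixels are assumptions made for illustration and are not taken from the abstract.

```python
import math


def match_detections(ground_truth, detections, radius=25.0):
    """Greedily match each detection to the nearest unmatched annotation.

    Both inputs are lists of (x, y) cell centers. The matching rule and
    the radius are illustrative assumptions, not the authors' protocol.
    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for dx, dy in detections:
        best_index, best_distance = None, radius
        for i, (gx, gy) in enumerate(unmatched):
            distance = math.hypot(dx - gx, dy - gy)
            if distance <= best_distance:
                best_index, best_distance = i, distance
        if best_index is not None:
            # Each annotation can be matched at most once.
            unmatched.pop(best_index)
            true_positives += 1
    false_positives = len(detections) - true_positives
    false_negatives = len(unmatched)
    return true_positives, false_positives, false_negatives


def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there is nothing to score."""
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


if __name__ == "__main__":
    annotated = [(0, 0), (100, 100), (200, 200)]
    detected = [(3, 4), (101, 100), (500, 500)]
    tp, fp, fn = match_detections(annotated, detected)
    print(tp, fp, fn, round(f1_score(tp, fp, fn), 3))
```

Because each annotation is matched at most once, two detections on the same cell count as one true positive and one false positive.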
The manual count of mitotic figures, which is determined in the tumor region with the highest mitotic activity, is a key parameter of most tumor grading schemes. It can, however, depend strongly on the area selection because mitotic figures are unevenly distributed in the tumor section. We aimed to assess how strongly the area selection can affect the mitotic count, which is known to have high inter-rater disagreement. On a data set of 32 whole slide images of H&E-stained canine cutaneous mast cell tumors, fully annotated for mitotic figures, we asked eight veterinary pathologists (five board-certified, three in training) to select a field of interest for the mitotic count. To assess the potential effect on the mitotic count, we compared the mitotic count of the selected regions to the overall distribution on the slide. Additionally, we evaluated three deep learning-based methods for the assessment of highest mitotic density. In the first approach, the model directly predicts the mitotic count for the presented image patches as a regression task. The second method derives a segmentation mask for mitotic figures, which is then used to obtain a mitotic density. Finally, we evaluated a two-stage object-detection pipeline based on state-of-the-art architectures to identify individual mitotic figures. We found that the predictions by all models were, on average, better than those of the experts. The two-stage object detector performed best and outperformed most of the human pathologists on the majority of tumor cases. The correlation between the predicted and the ground truth mitotic count was also best for this approach (0.963–0.979). Further, we found considerable differences in position selection between pathologists, which could partially explain the high variance that has been reported for the manual mitotic count.
To achieve better inter-rater agreement, we propose using computer-based area selection to support the pathologist in the manual mitotic count.
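A computer-based area selection of this kind can be illustrated with a small sketch: given a grid of per-tile mitotic figure counts (from annotations or from a detector), find the rectangular window with the highest total count. The tile grid, the window size, and the function name `densest_window` are hypothetical choices for illustration, not the authors' implementation.

```python
def densest_window(tile_counts, window_rows, window_cols):
    """Return (count, top_row, left_col) of the window with the most mitotic figures.

    tile_counts is a rectangular list of lists holding the number of
    mitotic figures per tile. A 2D prefix sum makes each window query O(1).
    """
    rows, cols = len(tile_counts), len(tile_counts[0])
    if window_rows > rows or window_cols > cols:
        raise ValueError("window is larger than the tile grid")

    # prefix[r][c] holds the sum of all tiles above and to the left of (r, c).
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            prefix[r + 1][c + 1] = (
                tile_counts[r][c] + prefix[r][c + 1] + prefix[r + 1][c] - prefix[r][c]
            )

    best = None
    for r in range(rows - window_rows + 1):
        for c in range(cols - window_cols + 1):
            total = (
                prefix[r + window_rows][c + window_cols]
                - prefix[r][c + window_cols]
                - prefix[r + window_rows][c]
                + prefix[r][c]
            )
            # Keep the first window that reaches the maximum count.
            if best is None or total > best[0]:
                best = (total, r, c)
    return best


if __name__ == "__main__":
    counts = [
        [0, 1, 0, 0],
        [0, 2, 3, 0],
        [1, 0, 4, 1],
    ]
    print(densest_window(counts, 2, 2))
```

Comparing the count in a pathologist's chosen field against the maximum returned here gives one simple measure of how much the area selection changes the mitotic count.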
Exercise-induced pulmonary hemorrhage (EIPH) is a common condition in sport horses with negative impact on performance. Cytology of bronchoalveolar lavage fluid by use of a scoring system is considered the most sensitive diagnostic method. Macrophages are classified depending on the degree of cytoplasmic hemosiderin content. The current gold standard is manual grading, which, however, is monotonous and time-consuming. We evaluated state-of-the-art deep learning-based methods for single cell macrophage classification, compared them against the performance of nine cytology experts, and evaluated inter- and intra-observer variability. Additionally, we evaluated object detection methods on a novel data set of 17 completely annotated cytology whole slide images (WSI) containing 78,047 hemosiderophages. Our deep learning-based approach reached a concordance of 0.85, partially exceeding human expert concordance (0.68 to 0.86, mean of 0.73, SD of 0.04). Intra-observer variability was high (0.68 to 0.88) and inter-observer concordance was moderate (Fleiss' kappa = 0.67). Our object detection approach has a mean average precision of 0.66 over the five classes from the whole slide gigapixel image and a computation time of below two minutes. To mitigate the high inter- and intra-rater variability, we propose our automated object detection pipeline, enabling accurate, reproducible and quick EIPH scoring in WSI.
Patients with pulmonary hemorrhage (P-Hem) suffer from repeated bleeding into the lungs, which can result in dyspnea and, if untreated, may have life-threatening consequences 1. There are various causes which lead to P-Hem, including drug abuse, premature birth, leukaemia, autoimmune disorders and immunodeficiencies 2-6. In this paper, we focus on a special subtype of P-Hem called exercise-induced pulmonary hemorrhage (EIPH) in horses.
Although EIPH also affects healthy human athletes 7 and racing greyhounds 8, it is diagnosed most commonly in racing horses and causes reduced athletic performance 9-12. The gold standard for diagnosis of P-Hem in humans and equine animals is to perform cytology of bronchoalveolar lavage fluid (BALF) 4,13 using a scoring system as explained by Golde et al. 4. The red blood cells of the bleeding are degraded into an iron-storage complex called hemosiderin by alveolar macrophages. Hemosiderin-laden macrophages are called hemosiderophages. Prior to microscopic evaluation, the cells are extracted by the BALF procedure and stained with Perls' Prussian Blue 14 or Turnbull's Blue 15 in order to visualise the iron pigments contained in the hemosiderin. According to the commonly used scoring system (macrophages hemosiderin score) by Golde et al. 4, alveolar macrophages can be distinguished into five grades depending on their hemosiderin content. This scoring system is based on the principle that a higher score correlates with increased alveolar bleeding 16.
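As an illustration of how five per-cell grades can be turned into a single slide-level score, the sketch below aggregates grade counts into a weighted mean scaled to a range of 0 to 400. The text above does not state the aggregation formula, so the grade numbering (0 to 4), the scaling factor of 100, and the function name `hemosiderin_score` are assumptions.

```python
def hemosiderin_score(grade_counts):
    """Aggregate per-grade macrophage counts into one slide-level score.

    grade_counts[g] is the number of macrophages assigned grade g, with
    g running from 0 (no hemosiderin) to 4 (highest content). The score
    is 100 times the mean grade, giving a value between 0 and 400.
    This aggregation is an assumed convention, not quoted from the text.
    """
    if len(grade_counts) != 5:
        raise ValueError("expected counts for exactly five grades (0 to 4)")
    total_cells = sum(grade_counts)
    if total_cells == 0:
        raise ValueError("no cells were graded")
    weighted_sum = sum(grade * count for grade, count in enumerate(grade_counts))
    return 100.0 * weighted_sum / total_cells


if __name__ == "__main__":
    # 50 cells of grade 0, 30 of grade 1, 10 of grade 2, 5 of grade 3, 5 of grade 4
    print(hemosiderin_score([50, 30, 10, 5, 5]))
```

Under this convention, a slide with more high-grade hemosiderophages yields a higher score, consistent with the principle that a higher score corresponds to increased alveolar bleeding.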
Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and of up to 0.696 on a human breast cancer dataset.