Exercise-induced pulmonary hemorrhage (EIPH) is a common condition in sport horses with a negative impact on performance. Cytology of bronchoalveolar lavage fluid using a scoring system is considered the most sensitive diagnostic method. Macrophages are classified according to the degree of cytoplasmic hemosiderin content. The current gold standard is manual grading, which, however, is monotonous and time-consuming. We evaluated state-of-the-art deep learning-based methods for single-cell macrophage classification, compared them against the performance of nine cytology experts, and evaluated inter- and intra-observer variability. Additionally, we evaluated object detection methods on a novel data set of 17 completely annotated cytology whole slide images (WSI) containing 78,047 hemosiderophages. Our deep learning-based approach reached a concordance of 0.85, partially exceeding human expert concordance (0.68 to 0.86, mean of 0.73, SD of 0.04). Intra-observer variability was high (0.68 to 0.88) and inter-observer concordance was moderate (Fleiss' kappa = 0.67). Our object detection approach has a mean average precision of 0.66 over the five classes from the whole slide gigapixel image and a computation time of less than two minutes. To mitigate the high inter- and intra-rater variability, we propose our automated object detection pipeline, enabling accurate, reproducible and quick EIPH scoring in WSI.

Patients with pulmonary hemorrhage (P-Hem) suffer from repeated bleeding into the lungs, which can result in dyspnea and, if untreated, may have life-threatening consequences 1. Various causes lead to P-Hem, including drug abuse, premature birth, leukaemia, autoimmune disorders and immunodeficiencies 2-6. In this paper, we focus on a special subtype of P-Hem called exercise-induced pulmonary hemorrhage (EIPH) in horses.
Although EIPH also affects healthy human athletes 7 and racing greyhounds 8, it is diagnosed most commonly in racing horses and causes reduced athletic performance 9-12. The gold standard for diagnosis of P-Hem in humans and equine animals is cytology of bronchoalveolar lavage fluid (BALF) 4,13 using a scoring system as explained by Golde et al. 4. Red blood cells from the hemorrhage are degraded by alveolar macrophages into an iron-storage complex called hemosiderin. Hemosiderin-laden macrophages are called hemosiderophages. Prior to microscopic evaluation, the cells are extracted by the BALF procedure and stained with Perls' Prussian Blue 14 or Turnbull's Blue 15 in order to visualise the iron pigments contained in the hemosiderin. According to the commonly used scoring system (macrophages hemosiderin score) by Golde et al. 4, alveolar macrophages can be classified into five grades depending on their hemosiderin content. This scoring system is based on the principle that a higher score correlates with increased alveolar bleeding 16.
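
As an illustration of how such a grade-based score can be aggregated, the following is a minimal sketch, not the exact procedure of Golde et al. It assumes that the five grades are encoded as the integers 0 to 4 and that the score is the grade-weighted count normalized to 100 cells (giving a range of 0 to 400). The function name `hemosiderin_score` and both of these conventions are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable

# Assumption: the five grades are encoded as integers 0 (no hemosiderin) to 4.
VALID_GRADES = frozenset(range(5))


def hemosiderin_score(grades: Iterable[int]) -> float:
    """Aggregate per-cell grades into one score, normalized to 100 cells.

    Assumed convention: score = 100 * sum(grade) / number_of_cells,
    so the result lies between 0 (all grade 0) and 400 (all grade 4).
    """
    counts = Counter(grades)
    invalid = set(counts) - VALID_GRADES
    if invalid:
        raise ValueError(f"unexpected grade values: {sorted(invalid)}")
    n_cells = sum(counts.values())
    if n_cells == 0:
        raise ValueError("at least one graded macrophage is required")
    weighted_sum = sum(grade * n for grade, n in counts.items())
    return 100.0 * weighted_sum / n_cells


if __name__ == "__main__":
    # Hypothetical sample: 50 cells of grade 0, 30 of grade 1, 20 of grade 2.
    sample = [0] * 50 + [1] * 30 + [2] * 20
    print(hemosiderin_score(sample))
```

Normalizing to 100 cells keeps scores comparable between samples with different cell counts, which matters when grades come from an automated pipeline that labels every macrophage on a whole slide image rather than a fixed-size manual sample.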