VRUs in a reachable area depending on the ego-car's velocity. Moreover, filtering via the degree of detection, allows for further contextualization in two regards. We measure a segmentation CNN's detection ability of well as visualization tools for the usecase of semantic segmentation in autonomous driving. Our approach present and implement methods that enable an instance based assessment and provide performance metrics as In this work, we therefore focus on pedestrians as VRU that are overlooked by a perception system. We part of the annotation in public domains test data sets, see e.g. (Cordts et al., 2016;Geyer et al., 2020). an instance based assessment, like overlooked vulnerable road users (VRUs). Information on instances is often evident that a measurement of performance based on pixel coverage is insufficient and should be replaced by the class specific asymmetry of importance of confusion events (Chan et al., 2019;Chan et al., 2020), it is confused with a lamppost, or fatal, if a pedestrian is overlooked due to a confusion with the street. Apart from For the example of autonomous driving, errors in perception could either be irrelevant, like if a tree is account the application specific failure modes. or autonomous driving, e.g. (Chen et al., 2015), the measurement of performance necessarily has to take into deployed as perceptive systems in a safety critical application like medical imaging, e.g. (Dong et al., 2017), application context of the computer vision system. If such modern artificial intelligence (AI) methods are (TP), false positive (FP) and false negative (FN) class predictions, it is still agnostic with regard to the Although the IoU as performance metric combines important quantities like the numbers of true positive segmentation masks for a specific category and then averaged over the classes. are computed per semantic category by comparing ground truth segmentation masks with predicted metrics like the commonly-used intersection over union (IoU, also known as Jaccard index (Jaccard, 1912)) generally is problematic if one semantic category of special importance is underrepresented. For this reason, Pixel-wise quantities however do not always distinguish between different semantic classes, which on a test data set, which is then averaged both over pixels and over test samples.(CNNs) (Chen and Zhu et al., 2018;Chollet, 2017;Sandler et al., 2018), one often uses pixel based metrics for the performance of a segmentation algorithm, in most cases realized by deep convolutional neural networks 1. Any pixel in a high resolution image is attributed a class from a pre-defined semantic space. As a metric Semantic segmentation combines the computer vision tasks of object classification and localization, see figure of several redundant systems. As metric to evaluate the performance and reliability of this perceptual function,