Deep neural networks (DNNs) have made impressive progress in the interpretation of image data, so that it is conceivable and to some degree realistic to use them in safety-critical applications like automated driving. From an ethical standpoint, the AI algorithm should take into account the vulnerability of objects or subjects on the street, which ranges from "not at all", e.g. the road itself, to "highly vulnerable", e.g. pedestrians. One way to take this into account is to define the cost of confusing one semantic category with another and to use cost-based decision rules for the interpretation of the probabilities output by DNNs. However, it remains an open problem how to define the cost structure and who should be in charge of doing so, thereby defining what AI algorithms will actually "see". As one possible answer, we follow a participatory approach and set up an online survey asking the public to define the cost structure. We present the survey design and the acquired data along with an evaluation that also distinguishes between perspective (car passenger vs. external traffic participant) and gender. Using simulation-based F-tests, we find highly significant differences between the groups. These differences have consequences for the reliable detection of pedestrians at a safety-critical distance to the self-driving car. We discuss the ethical problems related to this approach and also discuss, from a psychological point of view, the problems emerging from human-machine interaction through the survey. Finally, we include comments from industry leaders in the field of AI safety on the applicability of survey-based elements in the design of AI functionalities for automated driving.
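The cost-based decision rule mentioned above can be sketched as follows. The idea is to replace the usual argmax over the DNN's softmax probabilities with the decision that minimizes the expected confusion cost. The class names and cost matrix below are purely illustrative assumptions, not values taken from the survey.

```python
import numpy as np

# Hypothetical 3-class example: 0 = road, 1 = pedestrian, 2 = vegetation.
# C[k, j] is the assumed cost of deciding class k when the true class is j;
# overlooking a pedestrian (deciding "road" or "vegetation" when the truth
# is "pedestrian") is penalized most heavily. Illustrative numbers only.
C = np.array([
    [0.0, 10.0, 1.0],   # cost of deciding "road"
    [1.0,  0.0, 1.0],   # cost of deciding "pedestrian"
    [1.0,  8.0, 0.0],   # cost of deciding "vegetation"
])

def bayes_decision(probs, cost):
    """Cost-based decision rule: choose the class that minimizes the
    expected cost under the DNN's predicted class probabilities."""
    expected_cost = cost @ probs   # expected cost of each possible decision
    return int(np.argmin(expected_cost))

p = np.array([0.6, 0.3, 0.1])      # softmax output of the DNN for one pixel
argmax_choice = int(np.argmax(p))  # standard maximum-a-posteriori rule
cost_choice = bayes_decision(p, C) # cost-sensitive rule
```

With these illustrative costs, the standard argmax rule picks "road" (class 0), while the cost-based rule picks "pedestrian" (class 1), because the expected cost of overlooking a pedestrian outweighs the higher probability of "road". This is exactly the kind of shift in what the system "sees" that the choice of cost structure controls.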
… of several redundant systems. As a metric to evaluate the performance and reliability of this perceptual function, …

Semantic segmentation combines the computer vision tasks of object classification and localization, see Figure 1. Every pixel in a high-resolution image is attributed a class from a pre-defined semantic space. As a metric for the performance of a segmentation algorithm, in most cases realized by deep convolutional neural networks (CNNs) (Chen and Zhu et al., 2018; Chollet, 2017; Sandler et al., 2018), one often uses pixel-based metrics on a test data set, which are then averaged both over pixels and over test samples. Pixel-wise quantities, however, do not always distinguish between different semantic classes, which is generally problematic if one semantic category of special importance is underrepresented. For this reason, metrics like the commonly used intersection over union (IoU, also known as the Jaccard index (Jaccard, 1912)) are computed per semantic category, by comparing ground truth segmentation masks with predicted segmentation masks for a specific category, and then averaged over the classes.

Although the IoU as a performance metric combines important quantities like the numbers of true positive (TP), false positive (FP) and false negative (FN) class predictions, it is still agnostic with regard to the application context of the computer vision system. If such modern artificial intelligence (AI) methods are deployed as perceptive systems in a safety-critical application like medical imaging, e.g. (Dong et al., 2017), or autonomous driving, e.g. (Chen et al., 2015), the measurement of performance necessarily has to take into account the application-specific failure modes. For the example of autonomous driving, errors in perception could either be irrelevant, like a tree being confused with a lamppost, or fatal, if a pedestrian is overlooked due to a confusion with the street. Apart from the class-specific asymmetry in the importance of confusion events (Chan et al., 2019; Chan et al., 2020), it is evident that a measurement of performance based on pixel coverage is insufficient and should be replaced by an instance-based assessment that captures events like overlooked vulnerable road users (VRUs). Information on instances is often part of the annotation in public-domain test data sets, see e.g. (Cordts et al., 2016; Geyer et al., 2020).

In this work, we therefore focus on pedestrians as VRUs that are overlooked by a perception system. We present and implement methods that enable an instance-based assessment and provide performance metrics as well as visualization tools for the use case of semantic segmentation in autonomous driving. Our approach measures a segmentation CNN's ability to detect VRUs in a reachable area depending on the ego-car's velocity. Moreover, filtering via the degree of detection allows for further contextualization in two regards.
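The per-class IoU computation described above can be sketched in a few lines. The masks below are tiny illustrative arrays (not from any benchmark); the function compares a ground-truth and a predicted mask class by class, using the TP/FP/FN counts that the IoU combines.

```python
import numpy as np

def per_class_iou(gt, pred, num_classes):
    """Intersection over union (Jaccard index) per semantic class,
    computed from a ground-truth and a predicted segmentation mask."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((gt == c) & (pred == c))   # true positives for class c
        fp = np.sum((gt != c) & (pred == c))   # false positives
        fn = np.sum((gt == c) & (pred != c))   # false negatives
        denom = tp + fp + fn
        # IoU = TP / (TP + FP + FN); undefined if the class never occurs
        ious.append(tp / denom if denom > 0 else float("nan"))
    return ious

# Toy 2x3 masks with two classes (0 and 1), purely for illustration.
gt   = np.array([[0, 0, 1], [0, 1, 1]])
pred = np.array([[0, 1, 1], [0, 1, 0]])
ious = per_class_iou(gt, pred, num_classes=2)
miou = np.nanmean(ious)   # mean IoU: average over classes, not over pixels
```

Note that averaging over classes, as in the last line, weights a rare but safety-relevant class like "pedestrian" equally with dominant classes like "road", which is precisely why the per-class IoU is preferred over raw pixel accuracy; it remains, however, agnostic to the asymmetric cost of confusions discussed in the text.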