Whenever a visual perception system is employed in safety-critical applications such as automated driving, a thorough, task-oriented experimental evaluation is necessary to guarantee safe system behavior. While most standard evaluation methods in computer vision provide good comparability on benchmarks, they tend to fall short of assessing the system performance that is actually relevant for the given task. In our work, we consider pedestrian detection as a highly relevant perception task, and we argue that standard measures such as Intersection over Union (IoU) are insufficient, mainly because they are insensitive to important physical cues including distance, speed, and direction of motion. Therefore, we investigate so-called relevance metrics, which exploit specific domain knowledge to obtain a task-oriented performance measure; in this initial work, we focus on distance. Our experimental setup is based on the CARLA simulator and allows a controlled evaluation of the impact of that domain knowledge. Our first results indicate a linear decrease of the IoU with the pedestrians' distance, leading to the proposal of a first relevance metric that is also conditioned on distance.
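To make the idea of a distance-conditioned measure concrete, the following is a minimal sketch and not the metric proposed in the work. It computes the standard IoU of two axis-aligned boxes and then averages it over detections with a weight that falls off linearly with distance. The function names, the linear weighting, and the 50 m cutoff are all assumptions chosen for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes yield an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def distance_weight(distance_m, max_distance_m=50.0):
    """Hypothetical weight: 1 at 0 m, falling linearly to 0 at max_distance_m."""
    return max(0.0, 1.0 - distance_m / max_distance_m)


def distance_weighted_iou(samples, max_distance_m=50.0):
    """Weighted mean IoU over (pred_box, gt_box, distance_m) triples.

    Nearby pedestrians contribute more than distant ones. This weighting
    is an illustrative assumption, not the paper's definition.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pred_box, gt_box, distance_m in samples:
        w = distance_weight(distance_m, max_distance_m)
        weighted_sum += w * box_iou(pred_box, gt_box)
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else 0.0


samples = [
    ((0, 0, 2, 2), (0, 0, 2, 2), 0.0),   # perfect match, very close
    ((0, 0, 2, 2), (1, 1, 3, 3), 25.0),  # partial overlap, mid range
]
print(distance_weighted_iou(samples))
```

Under this weighting, a poor localization of a nearby pedestrian lowers the score more than the same error on a distant one, which a plain mean IoU would not distinguish.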
The evaluation of camera-based perception functions in automated driving (AD) is a significant challenge and requires large-scale, high-quality datasets. Recently proposed metrics for safety evaluation additionally require detailed per-instance annotations of dynamic properties, such as distance and velocity, that may not be available in openly accessible AD datasets. Synthetic data from 3D simulators like CARLA may provide a solution to this problem, as labeled data can be produced in a structured manner. However, CARLA currently lacks instance segmentation ground truth. In this paper, we present a back projection pipeline that allows us to obtain accurate instance segmentation maps for CARLA, which are necessary for precise per-instance ground truth information. Our evaluation results show that per-pedestrian depth aggregation obtained from our instance segmentation is more precise than previously available approximations based on bounding boxes, especially in the context of crowded scenes in urban automated driving.
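The difference between mask-based and box-based depth aggregation can be illustrated with a small sketch. It assumes a per-pixel depth map and a per-pixel instance ID map of the same size, both as nested lists, and uses the median as the aggregation function. The function names, the data layout, and the choice of median are assumptions for illustration and are not taken from the described pipeline.

```python
from statistics import median


def depth_from_instance_mask(depth_map, instance_map, instance_id, aggregate=median):
    """Aggregate depth over exactly the pixels labeled with instance_id."""
    values = [
        depth_map[y][x]
        for y, row in enumerate(instance_map)
        for x, label in enumerate(row)
        if label == instance_id
    ]
    if not values:
        raise ValueError(f"instance {instance_id} has no pixels")
    return aggregate(values)


def depth_from_bbox(depth_map, bbox, aggregate=median):
    """Aggregate depth over every pixel in an inclusive (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox
    values = [
        depth_map[y][x]
        for y in range(y_min, y_max + 1)
        for x in range(x_min, x_max + 1)
    ]
    if not values:
        raise ValueError("bounding box is empty")
    return aggregate(values)


# Toy scene: a thin pedestrian (ID 7) at 10 m in front of a wall at 40 m.
depth_map = [
    [40.0, 10.0, 40.0],
    [40.0, 10.0, 40.0],
    [40.0, 10.0, 40.0],
]
instance_map = [
    [0, 7, 0],
    [0, 7, 0],
    [0, 7, 0],
]

mask_depth = depth_from_instance_mask(depth_map, instance_map, 7)
box_depth = depth_from_bbox(depth_map, (0, 0, 2, 2))
print(mask_depth, box_depth)
```

In this toy scene the box contains more background than pedestrian pixels, so the box-based estimate lands on the wall rather than the pedestrian, while the mask-based estimate does not. In crowded scenes, overlapping boxes would mix in pixels from neighboring pedestrians in the same way.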