With the growing use of Deep Neural Networks (DNNs) in safety-critical applications comes an increasing need for their Verification and Validation (V&V). Unlike software testing, for which several established V&V methods exist, DNN testing is still at an early stage, and the data-driven nature of DNNs adds to its complexity. In the scope of autonomous driving, we showcase our validation method by leveraging object-level annotations (object metadata) to test DNNs at a more granular level using human-understandable semantic concepts such as gender, shirt colour, age, and illumination. As we detail, this enhanced granularity can prove useful for constructing closed-loop tests or for investigating dataset coverage and completeness. Our add-on sensor for the CARLA simulator enables us to generate datasets with this granular metadata. For the task of semantic segmentation for pedestrian detection using DeepLabv3+, we highlight potential insights and challenges that become apparent at this level of granularity. For instance, imbalances in the pedestrian distribution of a CARLA-generated dataset do not directly carry over into weak spots of the DNN's performance, and vice versa.
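As a minimal sketch of the concept-level evaluation described above: given per-object IoU scores from a segmentation model and per-pedestrian metadata, performance can be sliced by semantic concept. The column names, attribute values, and scores below are illustrative assumptions, not the paper's actual schema or results.

import pandas as pd

# Each row: one pedestrian instance with its (hypothetical) metadata
# and the model's per-object IoU on that instance.
records = pd.DataFrame([
    {"gender": "female", "shirt_colour": "red",  "age": "adult",
     "illumination": "day",   "iou": 0.81},
    {"gender": "male",   "shirt_colour": "blue", "age": "child",
     "illumination": "night", "iou": 0.44},
    {"gender": "female", "shirt_colour": "blue", "age": "adult",
     "illumination": "night", "iou": 0.62},
])

# Concept-level slicing: mean IoU and sample count per attribute value,
# exposing both potential weak spots and dataset imbalances.
for concept in ["gender", "shirt_colour", "age", "illumination"]:
    stats = records.groupby(concept)["iou"].agg(["mean", "count"])
    print(f"--- {concept} ---\n{stats}\n")

Reporting the sample count alongside each slice's mean score is what allows dataset imbalance and model weak spots to be compared directly, as the abstract notes they need not coincide.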