Grasping unknown objects in unstructured environments is one of the most demanding tasks for robotic bin picking systems. A holistic approach is crucial for building such dexterous bin picking systems that meet practical requirements on speed, cost, and reliability. Datasets proposed so far focus only on individual challenging sub-problems and are therefore limited in their ability to leverage the complementary relationship between individual tasks. In this paper, we tackle this holistic data challenge and design MetaGraspNetV2, an all-in-one bin picking dataset consisting of (i) a photo-realistic dataset with over 296k images, created through physics-based metaverse synthesis, and (ii) a real-world test dataset with 3.2k images featuring task-specific difficulty levels. Both datasets provide full annotations for amodal panoptic segmentation, object relationship detection, occlusion reasoning, 6-DoF pose estimation, and grasp detection for a parallel-jaw as well as a vacuum gripper. Extensive experiments demonstrate that our dataset outperforms state-of-the-art datasets in object detection, instance segmentation, amodal detection, parallel-jaw grasping, and vacuum grasping. Furthermore, leveraging the potential of our data for building holistic perception systems, we propose a single-shot-multi-pick (SSMP) grasping policy that exploits scene understanding to accelerate picking in high clutter. Given a single image acquisition, SSMP reasons about suitable manipulation orders for blindly picking multiple items. Physical robot experiments demonstrate that SSMP effectively shortens cycle times by reducing the number of image acquisitions by more than 47%, while providing better grasp performance than state-of-the-art bin picking methods.
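
To make the single-shot-multi-pick idea concrete, the following is a minimal, illustrative sketch of how a manipulation order for blind multi-picking could be derived from a single perception pass. It assumes a greedy ordering over grasp candidates annotated with an occlusion graph; all names (Candidate, perceive, execute_grasp, capture_image) are hypothetical placeholders and not the authors' actual implementation.

```python
# Illustrative SSMP sketch: order picks from one image so that each picked
# item is not occluded by any item still remaining in the bin.
from dataclasses import dataclass, field

@dataclass
class Candidate:
    object_id: int
    grasp_score: float                             # predicted grasp quality
    blocked_by: set = field(default_factory=set)   # ids of occluding objects

def ssmp_pick_order(candidates):
    """Greedy manipulation order: repeatedly select the best-scoring item
    that is not occluded by any item still left in the scene."""
    remaining = {c.object_id: c for c in candidates}
    order = []
    while remaining:
        free = [c for c in remaining.values()
                if not (c.blocked_by & remaining.keys())]
        if not free:      # unresolved occlusions -> trigger a new image instead
            break
        best = max(free, key=lambda c: c.grasp_score)
        order.append(best)
        del remaining[best.object_id]
    return order

# Hypothetical usage: one image acquisition yields several blind picks.
# image = capture_image()
# candidates = perceive(image)          # grasps + occlusion/relationship graph
# for c in ssmp_pick_order(candidates):
#     execute_grasp(c)                  # no re-imaging between these picks
```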