Robotic arms are currently in the spotlight of the industry of the future, but their efficiency faces major challenges. Efficient robotic grasping, intended to replace human labour, requires visual support. In this paper, we first propose to augment end-to-end deep-learning grasping with an object detection model in order to improve the efficiency of grasp pose prediction. The accurate position of an object is difficult to obtain from the depth image alone, owing to the absence of labels in the point cloud in an open environment. In our work, the detection information is fused with the depth image to obtain an accurate 3D mask of the point cloud, guiding the classical GraspNet to generate more accurate gripper poses. The detection-driven 3D mask method also allows the design of a priority scheme that increases the adaptability of grasping across scenarios. The proposed grasping method is validated on multiple benchmark datasets, achieving state-of-the-art performance.
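The fusion step described above — lifting a 2D detection into a 3D point-cloud mask via the depth image — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the pinhole intrinsics `fx, fy, cx, cy`, and the axis-aligned box format are assumptions for the sake of the example.

```python
import numpy as np

def bbox_to_point_cloud_mask(depth, bbox, fx, fy, cx, cy):
    """Lift a 2D detection box into a 3D point-cloud mask (hypothetical sketch).

    depth : (H, W) depth image in metres
    bbox  : (x_min, y_min, x_max, y_max) pixel box from the detector
    fx, fy, cx, cy : pinhole camera intrinsics
    Returns an (N, 3) array of camera-frame 3D points inside the box.
    """
    x0, y0, x1, y1 = bbox
    ys, xs = np.mgrid[y0:y1, x0:x1]   # pixel grid inside the detection box
    z = depth[ys, xs]
    valid = z > 0                     # drop pixels with missing depth
    xs, ys, z = xs[valid], ys[valid], z[valid]
    # back-project each pixel to a 3D point using the pinhole model
    x = (xs - cx) * z / fx
    y = (ys - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

The resulting masked points could then be passed to a grasp predictor such as GraspNet in place of the full scene cloud, restricting candidate grasps to the detected object.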