For autonomous intelligent systems, 3D object detection can serve as a basis for decision making by providing information such as an object's size, position, and direction, allowing the system to perceive its surrounding environment. Robust 3D object detection could have a large impact on the robotics industry and on the augmented and virtual reality sectors in the context of the Fourth Industrial Revolution (IR4.0). Recently, deep learning has become a promising approach to 3D object detection because it learns powerful semantic object features for tasks such as depth map construction, segmentation, and classification. As a result, the number of proposed methods has grown rapidly in recent years. Although considerable effort has been made to address 3D object detection, an in-depth and critical review from different viewpoints is still lacking. Consequently, comparing methods remains challenging, even though such comparison is important for selecting a method for a particular application. Given the strong heterogeneity of previous methods, this research aims to analyze and systematize existing research in terms of challenges and methodologies from different viewpoints, in order to guide future development and evaluation and to bridge the gaps across various sensors, namely cameras, LiDAR, and Pseudo-LiDAR. First, this research presents a critical analysis of existing sophisticated methods by identifying six key areas based on current scenarios, challenges, and significant problems that remain to be solved. Next, it presents a comprehensive analysis for validating 3D object detection methods based on eight authoritative 3D detection benchmark datasets, selected according to dataset size, and eight validation metrics. Finally, insights into existing challenges are presented as directions for future work. 
Overall, the extensive review presented in this research can contribute significantly to further investigation of multimodal 3D object detection.