2019
DOI: 10.1115/1.4043222

Multimodal Fusion Object Detection System for Autonomous Vehicles

Abstract: In order for autonomous vehicles to safely navigate roadways, accurate object detection must take place before safe path planning can occur. Currently, general-purpose object detection convolutional neural network (CNN) models have the highest detection accuracies of any method. However, there is a gap in the proposed detection frameworks: those that provide the high detection accuracy necessary for deployment do not perform inference in real time, and those that perform inference in real time…

Cited by 24 publications (7 citation statements)
References 32 publications
“…Multimodal deep learning models that can ingest pixel data along with other data types (fusion) have been successful in applications outside of medicine, such as autonomous driving and video classification. As an example, a multimodal fusion detection system for autonomous vehicles, that combines visual features from cameras along with data from Light Detection and Ranging (LiDAR) sensors, is able to achieve significantly higher accuracy (3.7% improvement) than a single-modal CNN detection model 21 . Similarly, a multimodal social media video classification pipeline leveraging both visual and textual features increased the classification accuracy to 88.0%, well above single modality neural networks such as Google’s InceptionV3 which reached an accuracy of 76.4% on the same task 22 .…”
Section: Introduction
confidence: 99%
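The multimodal fusion idea described above — combining visual features from a camera branch with features derived from LiDAR — can be illustrated with a minimal late-fusion sketch. This is an assumption-laden illustration only: the cited system's actual fusion architecture is not specified in this excerpt, and the feature dimensions below are hypothetical.

```python
import numpy as np

def fuse_features(camera_feat, lidar_feat):
    """Late fusion by concatenation: join per-object feature vectors
    from a camera CNN branch and a LiDAR branch into a single vector
    that a shared detection head could score. (Illustrative sketch;
    not the cited paper's exact architecture.)"""
    return np.concatenate([camera_feat, lidar_feat], axis=-1)

# Hypothetical sizes: 256-d camera features, 64-d LiDAR features,
# for 10 candidate objects in a frame.
camera_feat = np.random.rand(10, 256)
lidar_feat = np.random.rand(10, 64)
fused = fuse_features(camera_feat, lidar_feat)
print(fused.shape)  # (10, 320)
```

Concatenation is the simplest fusion strategy; real systems often add learned projection layers or attention over the two modalities before the detection head.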
“…Representative examples of such methodologies are the Sliding Shapes [23], Vote3D [24], VoxelNet [25], MFDS [26], MEGVII [27], 3D FCN [28], Vote3Deep [24], SECOND [29], Patch Refinement [30], Fast Point R-CNN [31], Voxel-FPN [32], PV-RCNN [33], HotSpotNet [67], 3DBN [34], Fusion of Fusion Net [35] and Point A² Net [36].…”
Section: Data Representation Approaches
confidence: 99%
“…Several methods, such as Frustum PointNet (F-PointNet) [46], Multi-View 3D Object Detection (MV3D) [51], PointFusion [43], Aggregate View Object Detection network (AVOD) [56], RegiOn Approximation Refinement Network (Roarnet) [44], ContFuse [58], Multimodal Fusion Detection System (MFDS) [26], Multi-task Multi-sensor Fusion (MMF) [59], SCANet [57], PointPainting [45], SIFRNet [48], Complex-Retina [60] and LaserNet++ [61], propose a combination of images and LiDAR data to improve object detection accuracy. This allows object detection in difficult scenarios, such as classifying small objects (pedestrians, cyclists) or distant objects, which is one of the limitations of LiDAR-only object detection models.…”
Section: A. Fusion-Based Solutions
confidence: 99%
“…The remaining points in the point cloud were divided into clusters according to the Euclidean distance. Thus, the length of the object, its distance from the center and its direction were determined [26]. Babak et al proposed a new multi-sensor fusion pipeline configuration for object detection and tracking.…”
Section: Introduction
confidence: 99%
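The Euclidean-distance clustering step mentioned in the citation above can be sketched minimally: two points belong to the same cluster if they lie within a tolerance of each other, directly or through a chain of neighbors. This is an illustrative O(n²) sketch, not the cited paper's implementation; production pipelines typically accelerate the neighbor search with a k-d tree.

```python
import numpy as np

def euclidean_cluster(points, tolerance):
    """Group 3-D points into connected clusters under a distance
    threshold, via flood-fill over the pairwise-distance graph.
    Illustrative sketch only (naive O(n^2) neighbor search)."""
    n = len(points)
    labels = [-1] * n          # -1 means "not yet assigned"
    cluster_id = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        stack = [seed]
        labels[seed] = cluster_id
        while stack:
            i = stack.pop()
            dists = np.linalg.norm(points - points[i], axis=1)
            for j in np.flatnonzero(dists <= tolerance):
                if labels[j] == -1:
                    labels[j] = cluster_id
                    stack.append(int(j))
        cluster_id += 1
    return labels

# Two well-separated pairs of points yield two clusters.
pts = np.array([[0.0, 0, 0], [0.1, 0, 0], [5.0, 0, 0], [5.1, 0, 0]])
print(euclidean_cluster(pts, tolerance=0.5))  # [0, 0, 1, 1]
```

From each cluster, per-object properties such as extent, centroid distance, and heading can then be estimated, as the citation describes.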