MultiXNet: Multiclass Multistage Multimodal Motion Prediction

Djuric, Nemanja; Cui, Henggang; Su, Zhaoen; Wu, Shangxuan; Wang, Huahua; Chou, Fang‐Chieh; Martín, Luisa; Feng, Shijie; Hu, Rui; Xu, Yang; Dayan, Alyssa; Zhang, Sidney; Becker, Brian C.; Meyer, Gregory P.; Vallespi-Gonzalez, Carlos; Wellington, Carl

doi:10.1109/iv48863.2021.9575718

Cited by 25 publications

(29 citation statements)

References 26 publications

“…In order to take the multimodality into account, multiple trajectories can be predicted for an actor [10], [11], [12]. When the uncertainty of the prediction is considered, a spatial probability distribution is provided at each of the given timepoints independently [9], [13]. The mathematical details can also be found in the following section.…”

Section: Related Work (mentioning)

confidence: 99%

“…Next, we study the proposed representation in supervised trajectory prediction tasks by replacing the waypoint representation with the polynomial representation using (3), and compare prediction performances using the different representations. We adapt MultiXNet [9], which is a deep model with competitive performance designed to detect traffic actors around a SDV and predict their future trajectories.…”

Section: Applying the Representation In Supervised Learning (mentioning)

confidence: 99%

“…In robotics in general and self-driving vehicle (SDV) applications in particular, anticipating the motion of other actors around the robot plays a critical role in planning safe paths to navigate the environment [1]. Recently, significant improvements have come from exploring the input representation of the sensor data [2], [3], [4], [5], [6] and the neural network structures [7], [8], [9]. Likewise, the output representation for trajectories has seen extensions to account for multimodality [10], [11], [12] and for modeling probability distributions over their future locations [9], [13].…”

Section: Introduction (mentioning)

confidence: 99%

“…Recently, significant improvements have come from exploring the input representation of the sensor data [2], [3], [4], [5], [6] and the neural network structures [7], [8], [9]. Likewise, the output representation for trajectories has seen extensions to account for multimodality [10], [11], [12] and for modeling probability distributions over their future locations [9], [13]. However, these output representations generally provide predictions of locations only at discrete and prefixed time-points which may also lack the physics constraints that govern object motion in the real world.…”

Section: Introduction (mentioning)

confidence: 99%

Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization

Su, Wang, Cui, et al. 2020

Preprint (self-citation)

A commonly-used representation for motion prediction of actors is a sequence of waypoints (comprising positions and orientations) for each actor at discrete future timepoints. While this approach is simple and flexible, it can exhibit unrealistic higher-order derivatives (such as acceleration) and approximation errors at intermediate time steps. To address this issue we propose a simple and general representation for temporally continuous probabilistic trajectory prediction that is based on polynomial trajectory parameterization. We evaluate the proposed representation on supervised trajectory prediction tasks using two large self-driving data sets. The results show realistic higher-order derivatives and better accuracy at interpolated time-points, as well as the benefits of the inferred noise distributions over the trajectories. Extensive experimental studies based on existing state-of-the-art models demonstrate the effectiveness of the proposed approach relative to other representations in predicting the future motions of vehicle, bicyclist, and pedestrian traffic actors.
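To illustrate why a polynomial parameterization is temporally continuous, here is a minimal sketch, not the authors' model: it assumes one coordinate of a trajectory is given by hypothetical polynomial coefficients, and shows that position, velocity, and acceleration can then be evaluated at any time, not only at fixed waypoints. The function names and coefficient values are illustrative assumptions.

```python
from typing import List, Sequence


def poly_eval(coeffs: Sequence[float], t: float) -> float:
    """Evaluate c0 + c1*t + c2*t^2 + ... at time t using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def poly_derivative(coeffs: Sequence[float]) -> List[float]:
    """Return the coefficients of the time derivative of the polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


# Hypothetical coefficients for x(t) = 2 + 5 t + 0.25 t^2 (meters, seconds).
x_coeffs = [2.0, 5.0, 0.25]
v_coeffs = poly_derivative(x_coeffs)   # velocity:     5 + 0.5 t
a_coeffs = poly_derivative(v_coeffs)   # acceleration: 0.5 (constant)

# Any time can be queried, including times between discrete waypoints.
x_mid = poly_eval(x_coeffs, 2.0)
v_mid = poly_eval(v_coeffs, 2.0)
a_mid = poly_eval(a_coeffs, 2.0)
```

Because the derivatives are obtained analytically from the same coefficients, they are smooth by construction, which is the property the abstract contrasts with waypoint sequences.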

“…Prior works in the field of sensor fusion have mostly focused on the perception aspect of driving, e.g. 2D and 3D object detection [22,12,66,9,44,31,34,61,33,37], motion forecasting [22,36,5,35,63,6,19,38,32,9], and depth estimation [24,60,61,33]. These methods focus on learning a state representation that captures the geometric and semantic information of the 3D scene.…”

Section: Introduction (mentioning)

confidence: 99%

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Prakash, Chitta, Geiger 2021

Preprint

How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting. However, for the actual driving task, the global context of the 3D scene is key, e.g. a change in traffic light state can affect the behavior of a vehicle geometrically distant from that traffic light. Geometry alone may therefore be insufficient for effectively fusing representations in end-to-end driving models. In this work, we demonstrate that imitation learning policies based on existing sensor fusion methods under-perform in the presence of a high density of dynamic agents and complex scenarios, which require global contextual reasoning, such as handling traffic oncoming from multiple directions at uncontrolled intersections. Therefore, we propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention. We experimentally validate the efficacy of our approach in urban settings involving complex scenarios using the CARLA urban driving simulator. Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.

StretchBEV: Stretching Future Instance Prediction Spatially and Temporally

Akan, Güney 2022

Lecture Notes in Computer Science
