Understanding Pedestrian-Vehicle Interactions with Vehicle Mounted Vision: An LSTM Model and Empirical Analysis

Ridel, Daniela A.; Deo, Nachiket; Wolf, Denis F.; Trivedi, Mohan M.

doi:10.1109/ivs.2019.8813798

Cited by 31 publications

(15 citation statements)

References 34 publications

“…Alternatively, some approaches use temporal convolutional networks for encoding sequences of past locations [12], [13], allowing for faster run-times. In addition to location co-ordinates, some approaches also incorporate auxiliary information such as the head pose of pedestrians [9], [14] while encoding past motion. Many approaches jointly model the past motion of multiple agents in the scene to capture interaction between agents [5], [15], [12], [10], [7], [11].…”

Section: Related Studies (mentioning)

confidence: 99%

Scene Compliant Trajectory Forecast With Agent-Centric Spatio-Temporal Grids

Ridel

Deo

Wolf

et al. 2020

IEEE Robot. Autom. Lett.

Self Cite

Forecasting long-term human motion is a challenging task due to the non-linearity, multi-modality and inherent uncertainty in future trajectories. The underlying scene and past motion of agents can provide useful cues to predict their future motion. However, the heterogeneity of the two inputs poses a challenge for learning a joint representation of the scene and past trajectories. To address this challenge, we propose a model based on grid representations to forecast agent trajectories. We represent the past trajectories of agents using binary 2-D grids, and the underlying scene as a RGB birds-eye view (BEV) image, with an agent-centric frame of reference. We encode the scene and past trajectories using convolutional layers and generate trajectory forecasts using a Convolutional LSTM (ConvLSTM) decoder. Results on the publicly available Stanford Drone Dataset (SDD) show that our model outperforms prior approaches and outputs realistic future trajectories that comply with scene structure and past motion.
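The grid representation of past motion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `trajectory_to_grids`, the grid size, the cell size, and the row/column orientation are all assumptions made for illustration. The sketch produces one binary 2-D grid per past time step, centred on the agent's most recent position (an agent-centric frame).

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Grid = List[List[int]]


def trajectory_to_grids(past: Sequence[Point],
                        grid_size: int = 9,
                        cell_size: float = 1.0) -> List[Grid]:
    """Rasterize a past trajectory into binary grids (hypothetical helper).

    Each time step yields one grid_size x grid_size grid with a single 1
    at the cell containing the agent, relative to its latest position.
    Positions falling outside the grid leave that step's grid all zeros.
    """
    if not past:
        raise ValueError("past trajectory must contain at least one point")

    origin_x, origin_y = past[-1]   # agent-centric: latest position is the centre
    half = grid_size // 2
    grids: List[Grid] = []

    for x, y in past:
        grid = [[0] * grid_size for _ in range(grid_size)]
        # Round to the nearest cell; rows follow y and columns follow x (assumed).
        col = half + math.floor((x - origin_x) / cell_size + 0.5)
        row = half + math.floor((y - origin_y) / cell_size + 0.5)
        if 0 <= row < grid_size and 0 <= col < grid_size:
            grid[row][col] = 1
        grids.append(grid)
    return grids


# Three observed positions, rasterized onto 5 x 5 grids with 1-unit cells.
example_grids = trajectory_to_grids([(10.0, 5.0), (11.0, 5.0), (12.0, 6.0)],
                                    grid_size=5)
```

A stack of such grids has the same spatial layout as a bird's-eye-view image, which is what lets convolutional layers process both inputs in a common frame.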

“…• Secondly, to further simplify the motion modeling process, the long short-term memory (LSTM) technique is selected to build a time-series neural network. Such a network can infer the movement pattern of a pedestrian from various data types [17][18][19][20][21]. This work utilizes this data-driven approach to learn the vehicle-perspective data and predict the relative trajectory of pedestrians.…”

Section: Introduction (mentioning)

confidence: 99%

Prediction of Pedestrian Risky Level for Intelligent Vehicles

Zhang

Lü

et al. 2020

2020 IEEE Intelligent Vehicles Symposium (IV)

In recent years, road safety has attracted significant attention from researchers and practitioners in the intelligent transport systems domain. As one of the most common and vulnerable groups of road users, pedestrians cause great concerns due to their unpredictable behavior and movement, as subtle misunderstandings in vehicle-pedestrian interaction can easily lead to risky situations or collisions. Existing methods use either predefined collision-based models or human-labeling approaches to estimate the pedestrians' risks. These approaches are usually limited by their poor generalization ability and lack of consideration of interactions between the ego vehicle and a pedestrian. This work tackles the listed problems by proposing a Pedestrian Risk Level Prediction (PRLP) system. The system consists of three modules: data collection and processing module, pedestrian trajectory prediction module, and risk level identification module. Firstly, vehicle-perspective pedestrian data are collected. Since the data contains information regarding the movement of both the ego vehicle and pedestrian, it can simplify the prediction of spatiotemporal features in an interaction-aware fashion. Using the long short-term memory model, the pedestrian trajectory prediction module predicts their spatiotemporal features in the subsequent five frames. As the predicted trajectory follows certain interaction and risk patterns, a hybrid clustering and classification method is adopted to explore the risk patterns in the spatiotemporal features and train a risk level classifier using the learned patterns. Upon predicting the spatiotemporal features of pedestrians and identifying the corresponding risk level, the risk patterns between the ego vehicle and pedestrians are determined. 
Experimental results verified the capability of the PRLP system to predict the risk level of pedestrians, thus supporting the collision risk assessment of intelligent vehicles and providing safety warnings to both vehicles and pedestrians.
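The abstract's prediction setup, in which a sequence model forecasts features for the subsequent five frames, implies slicing each pedestrian track into observed/future pairs. The sketch below shows one common way to build such pairs. The prediction length of five comes from the abstract; the function name `make_windows`, the observation length of eight frames, and the stride of one frame are assumptions, not details of the PRLP system.

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float]          # e.g. a pedestrian's (x, y) in one frame
Sample = Tuple[List[Frame], List[Frame]]


def make_windows(track: Sequence[Frame],
                 obs_len: int = 8,    # assumed observation length
                 pred_len: int = 5    # five future frames, as in the abstract
                 ) -> List[Sample]:
    """Split one track into (observed, future) pairs with a sliding window."""
    samples: List[Sample] = []
    last_start = len(track) - obs_len - pred_len
    for start in range(last_start + 1):
        split = start + obs_len
        observed = list(track[start:split])
        future = list(track[split:split + pred_len])
        samples.append((observed, future))
    return samples


# A 15-frame track yields 15 - 8 - 5 + 1 = 3 training pairs.
track = [(float(t), 0.5 * t) for t in range(15)]
pairs = make_windows(track)
```

Tracks shorter than `obs_len + pred_len` frames produce no samples, which is usually the desired behaviour when preparing training data.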

“…Alternatively, some approaches use temporal convolutional networks for encoding sequences of past locations (LEE et al, 2017; NIKHIL; MORRIS, 2018), allowing for faster run-times. In addition to location coordinates, some approaches also incorporate auxiliary information such as the head pose of pedestrians (HASAN et al, 2018; RIDEL et al, 2019) while encoding past motion.…”

Section: Deep Learning (mentioning)

confidence: 99%

“…This is most commonly done by sampling generative models such as Generative Adversarial Networks (GANs) (GUPTA et al, 2018; SADEGHIAN et al, 2019; AMIRIAN; HAYET; PETTRÉ, 2019), Variational Autoencoders (VAEs) (LEE et al, 2017) and invertible models (RHINEHART; KITANI; VERNAZA, 2018). Some approaches sample a stochastic policy obtained using imitation learning or inverse reinforcement learning (Li, 2019; TRIVEDI, 2019). Other approaches learn mixture models (CUI et al, 2019; Zyner; Worrall; Nebot, 2019;.…”

Section: Deep Learning (mentioning)

confidence: 99%

“…When looking at humans and predicting their behaviors inside cities a common pipeline is first detecting them in 2D/3D images, then tracking them among consecutive images (video), by assigning a unique identifier, and then finally predicting their future behavior. The behavior prediction task was tackled in the literature in many forms, by classifying among many possible motion patterns (GAVRILA, 2013; KOEHLER et al, 2013; BONNIN et al, 2014; VÖLZ et al, 2015; HASHIMOTO et al, 2015b; KWAK; KO; NAM, 2017), by predicting one future trajectory (QUINTERO et al, 2015; GOLDHAMMER et al, 2015; FERGUSON et al, 2015; SCHULZ; STIEFELHAGEN, 2015a), or by predicting many possible trajectories (GUPTA et al, 2018; SADEGHIAN et al, 2019; AMIRIAN; HAYET; PETTRÉ, 2019; LEE et al, 2017; TRIVEDI, 2019; CUI et al, 2019; Zyner; Worrall; Nebot, 2019).…”

Section: Introduction (mentioning)

confidence: 99%

Scene compliant spatio-temporal multi-modal multi-agent long-term trajectory forecasting

Ridel