Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

Toyungyernsub, Maneekwan; Yel, Esen; Li, Jiachen; Kochenderfer, Mykel J.

doi:10.1109/iros47612.2022.9981323

Cited by 12 publications

(8 citation statements)

References 26 publications


“…Mersch et al [3], Sun et al [4], and Toyungyernsub et al [5] develop novel frameworks to extract features and detect dynamic points utilizing spatial and temporal information. Some of these methods use the point cloud format, while others choose to translate point clouds into different representations, such as residual images, to facilitate processing.…”

Section: A Learning-based Methods (mentioning)

confidence: 99%

UFOMap: An Efficient Probabilistic 3D Mapping Framework That Embraces the Unknown

Duberg, 2020

IEEE Robot. Autom. Lett.


3D models are an essential part of many robotic applications. In applications where the environment is unknown a priori, or where only a part of the environment is known, it is important that the 3D model can handle the unknown space efficiently. Path planning, exploration, and reconstruction all fall into this category. In this paper we present an extension to OctoMap which we call UFOMap. UFOMap uses an explicit representation of all three states in the map, i.e., unknown, free, and occupied. This gives, surprisingly, a more memory-efficient representation. We provide methods that allow for significantly faster insertions into the octree. Furthermore, UFOMap supports fast queries based on occupancy state using so-called indicators and based on location by exploiting the octree structure and bounding volumes. This enables real-time colored octree mapping at high resolution (below 1 cm). UFOMap is contributed as a C++ library that can be used standalone but is also integrated into ROS.


“…Many existing methods utilize the high-level history state information (e.g., position, velocity) and the context information (e.g., roadgraph/map, context agent trajectory) to forecast future state sequences [5], [7], [8], [10], [29]-[36]. There are two widely used ways to represent the roadgraph information: (a) rasterized top-down view images [29], [36], [37]; and (b) roadgraph vectors [5], [32]. In order to model the interactions between entities, different feature aggregation techniques are employed such as social pooling [35], attention mechanisms [7], and message passing across graphs [8].…”

Section: Related Work (mentioning)

confidence: 99%

PointMapNet: Point Cloud Feature Map Network for 3D Human Action Recognition

Huang, Zhang, et al., 2023

Symmetry


3D human action recognition is crucial in broad industrial application scenarios such as robotics, video surveillance, autonomous driving, and intellectual education. In this paper, we present a new point cloud sequence network called PointMapNet for 3D human action recognition. In PointMapNet, two point cloud feature maps symmetrical to depth feature maps are proposed to summarize appearance and motion representations from point cloud sequences. Specifically, we first convert the point cloud frames to virtual action frames using static point cloud techniques. The virtual action frame is a 1D vector used to characterize the structural details in the point cloud frame. Then, inspired by feature map-based human action recognition on depth sequences, two point cloud feature maps are symmetrically constructed to recognize human action from the point cloud sequence, i.e., Point Cloud Appearance Map (PCAM) and Point Cloud Motion Map (PCMM). To construct PCAM, an MLP-like network architecture is designed and used to capture the spatio-temporal appearance feature of the human action in a virtual action sequence. To construct PCMM, the MLP-like network architecture is used to capture the motion feature of the human action in a virtual action difference sequence. Finally, the two point cloud feature map descriptors are concatenated and fed to a fully connected classifier for human action recognition. To evaluate the performance of the proposed approach, extensive experiments are conducted. The proposed method achieves impressive results on three benchmark datasets, namely NTU RGB+D 60 (89.4% cross-subject and 96.7% cross-view), UTD-MHAD (91.61%), and MSR Action3D (91.91%). The experimental results outperform existing state-of-the-art point cloud sequence classification networks, demonstrating the effectiveness of our method.


“…Recent works have considered the problem of future egocentric OGM predictions ([6], [7], [8], [9], [10], [3]) by incorporating spatio-temporal deep-learning methods, involving combinations of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). These works predict the complete scene as OGMs and encounter similar challenges for forecasting such as blurriness, loss of scene Fig.…”

Section: A Occupancy Grid Prediction (mentioning)

confidence: 99%

“…These grid predictions, however, present challenges of blurry dynamic vehicles and their evaluation. With few common benchmark and direct comparison methods being available, the future prediction of OGMs are mostly evaluated against the actual generated grid ([3], [4]) rather than the ground truth. This carries the inherent risk of overlooking potential errors in OGMs generation during training and 1 Univ.…”

Section: Introduction (mentioning)

confidence: 99%

Allo-centric Occupancy Grid Prediction for Urban Traffic Scene Using Video Prediction Networks

Asghar, Rummelhard, Spalanzani, et al., 2022

2022 17th International Conference on Control, Automation, Robotics and Vision (ICARCV)


Prediction of the dynamic environment is crucial to the safe navigation of an autonomous vehicle. Urban traffic scenes are particularly challenging to forecast due to complex interactions between various dynamic agents, such as vehicles and vulnerable road users. Previous approaches have used egocentric occupancy grid maps to represent and predict dynamic environments. However, these predictions suffer from blurriness, loss of scene structure at turns, and vanishing of agents over longer prediction horizons. In this work, we propose a novel framework to make long-term predictions by representing the traffic scene in a fixed frame, referred to as an allo-centric occupancy grid. This allows the static scene to remain fixed and the motion of the ego-vehicle to be represented on the grid like that of other agents. We study the allo-centric grid prediction with different video prediction networks and validate the approach on the real-world nuScenes dataset. The results demonstrate that the allo-centric grid representation significantly improves scene prediction in comparison to the conventional ego-centric grid approach.
