2020 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra40945.2020.9196716

Any Motion Detector: Learning Class-agnostic Scene Dynamics from a Sequence of LiDAR Point Clouds

Cited by 23 publications (14 citation statements)
References 15 publications
“…It is extended to a multi-task network in order to additionally predict semantic classes and relies on different input data. Filatov et al. [9] introduce a recurrent network architecture for predicting a velocity grid from a sequence of lidar point clouds. First, the lidar data is processed with a voxel feature encoding layer [10] to obtain bird's eye view representations, which are aggregated with a convolutional recurrent network layer.…”
Section: Related Work
confidence: 99%
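To make the recurrent BEV aggregation described in this excerpt concrete, here is a minimal PyTorch sketch of a convolutional GRU folding a sequence of bird's-eye-view feature maps into a single hidden state. The cell design, channel sizes, and function names are illustrative assumptions, not the implementation from [9].

```python
# Illustrative sketch only: a convolutional GRU aggregating a sequence of
# bird's-eye-view (BEV) feature maps into one hidden state.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # update/reset gates
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)       # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        n = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * n

def aggregate_bev_sequence(cell, bev_seq):
    """Fold T BEV frames of shape (B, T, C, H, W) into one hidden state."""
    B, T, C, H, W = bev_seq.shape
    h = bev_seq.new_zeros(B, cell.cand.out_channels, H, W)
    for t in range(T):
        h = cell(bev_seq[:, t], h)
    return h  # the last hidden state feeds the downstream backbone
```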
“…These feature maps are then processed in a feature pyramid and a flow network to predict the scene flow in a 2D grid. Compared to dynamic occupancy grid maps, the works in [9] and [11] focus on velocity estimation but do not model free space or occlusions. Wu et al. [13] introduce a spatio-temporal network architecture to predict a 2D grid encoding motion and semantics for each cell, based on 3D lidar point clouds.…”
Section: Related Work
confidence: 99%
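The per-cell grid outputs described here can be pictured as two lightweight convolutional heads on a shared BEV feature map, one regressing a 2D motion vector per cell and one classifying the cell. This is a hedged sketch under assumed channel counts and head design, not the architecture from [13].

```python
# Illustrative sketch: two 1x1-conv heads decoding a shared BEV feature map
# into per-cell motion and per-cell semantics. All sizes are assumptions.
import torch.nn as nn

class MotionSemanticHead(nn.Module):
    def __init__(self, in_ch=64, num_classes=5):
        super().__init__()
        self.motion = nn.Conv2d(in_ch, 2, kernel_size=1)              # (dx, dy) per cell
        self.semantic = nn.Conv2d(in_ch, num_classes, kernel_size=1)  # class logits per cell

    def forward(self, feat):  # feat: (B, in_ch, H, W)
        return self.motion(feat), self.semantic(feat)
```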
“…In this work, we further develop this approach and propose our method for scenarios with a moving ego-vehicle. In a concurrent approach, Filatov et al. [17] propose a novel architecture to estimate class-agnostic scene dynamics as a grid representation, using a sequence of lidar point clouds as input data that is first processed in voxel feature encoding layers [18]. Afterwards, these features are aggregated in a convolutional recurrent network layer with ego-motion compensation, and the last hidden state is processed by a ResNet18-FPN backbone network to finally predict a segmentation and velocity grid.…”
Section: Related Work
confidence: 99%
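The voxel feature encoding (VFE) step referenced here ([18], in the style of VoxelNet) can be sketched roughly as a shared pointwise layer followed by a max-pool over each voxel's points, with the pooled feature concatenated back to every point. This is a simplified sketch: real implementations also mask padded points, and the shapes and sizes below are assumptions.

```python
# Simplified VoxelNet-style voxel feature encoding (VFE) layer sketch.
import torch
import torch.nn as nn

class VFELayer(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.fc = nn.Linear(in_ch, out_ch // 2)
        self.bn = nn.BatchNorm1d(out_ch // 2)

    def forward(self, pts):  # pts: (num_voxels, max_pts, in_ch), padded points not masked here
        x = self.fc(pts)                                         # pointwise features
        x = torch.relu(self.bn(x.transpose(1, 2)).transpose(1, 2))
        pooled = x.max(dim=1, keepdim=True).values               # per-voxel aggregate
        return torch.cat([x, pooled.expand_as(x)], dim=2)        # pointwise + aggregated
```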
“…input placement and recurrent state shifting are not sufficient on their own to achieve ego-motion compensation for our architecture. Shifting the recurrent states alone, as applied in [16], [17], is only applicable if all recurrent layers have the same grid cell size. We argue that this is a strong limitation for the insertion of recurrent layers into fully convolutional network architectures, as most of them use network layers at different scales, e.g.…”
Section: Dynamic Grid Mapping With Moving Ego-vehicle
confidence: 99%
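The recurrent-state shifting this excerpt discusses can be illustrated with a small sketch: the hidden-state grid is translated by the ego-motion expressed in whole cells, which is exactly why the shift depends on the layer's cell size. Function and parameter names are assumptions; rotation handling and sub-cell interpolation are omitted.

```python
# Sketch of ego-motion compensation by shifting a recurrent hidden state.
import torch

def shift_hidden_state(h, ego_dx_m, ego_dy_m, cell_size_m):
    """h: (B, C, H, W) hidden state; ego translation in metres since last frame.
    Sign conventions are an assumption for illustration."""
    dx_cells = round(ego_dx_m / cell_size_m)   # translation in whole grid cells
    dy_cells = round(ego_dy_m / cell_size_m)
    h = torch.roll(h, shifts=(-dy_cells, -dx_cells), dims=(-2, -1))
    # Cells rolled in from the far side carry stale content; a full
    # implementation would zero them. Layers with different cell sizes would
    # need different (generally non-integer) shifts, hence the limitation
    # argued in the excerpt above.
    return h
```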
“…Another possibility to represent and estimate motion is based on the bird's eye view (BEV). Here, a point cloud is discretized into grid cells, and motion is described by encoding each cell with a 2D displacement vector indicating the cell's future position on the ground plane [8,17,39]. This compact representation simplifies scene motion effectively, since motion on the ground plane is the primary concern for autonomous driving, while motion in the vertical direction is far less important or useful.…”
Section: Introduction
confidence: 99%
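A rough NumPy sketch of the BEV motion encoding described above: points are binned into ground-plane cells, and each occupied cell stores the mean 2D displacement of its points. The grid extents, cell size, and averaging scheme are assumptions for illustration, not taken from any of the cited papers.

```python
# Illustrative sketch: discretize LiDAR points into a BEV grid and encode
# each occupied cell with a mean 2D displacement vector.
import numpy as np

def points_to_bev_displacement(xy, flow_xy, x_range=(-32.0, 32.0),
                               y_range=(-32.0, 32.0), cell=0.25):
    """xy: (N, 2) ground-plane point coords; flow_xy: (N, 2) per-point flow."""
    W = int((x_range[1] - x_range[0]) / cell)
    H = int((y_range[1] - y_range[0]) / cell)
    disp = np.zeros((H, W, 2), dtype=np.float32)
    count = np.zeros((H, W), dtype=np.int32)
    cols = ((xy[:, 0] - x_range[0]) / cell).astype(int)
    rows = ((xy[:, 1] - y_range[0]) / cell).astype(int)
    keep = (cols >= 0) & (cols < W) & (rows >= 0) & (rows < H)
    for r, c, f in zip(rows[keep], cols[keep], flow_xy[keep]):
        disp[r, c] += f
        count[r, c] += 1
    occupied = count > 0
    disp[occupied] /= count[occupied][:, None]  # mean displacement per cell
    return disp
```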