Panoptic Segmentation Forecasting

Graber, Colin; Tsai, Grace; Firman, Michael; Brostow, Gabriel J.; Schwing, Alexander G.

doi:10.1109/cvprw53098.2021.00257

Cited by 3 publications

(6 citation statements)

References 50 publications

“…3). We exceed some state-of-the-art methods, such as F2F and PSF [8] at short-term, while mid-term predictions are very close to the feature-to-feature approach introduced by Luc et al [19]. This confirms our choice of using motion cues, instead of learning different Feature Pyramid Networks.…”

Section: Input Vs Pred

supporting

confidence: 80%

“…Different than F2F [19], future predictions are finally generated by jointly training the whole system using Mask R-CNN [9] and Semantic FPN [14], for semantic segmentations and instance segmentations respectively. Graber et al [8], proposed a more complete framework to forecast the near future, by decomposing a dynamic scene into things and stuff, i.e. individual objects and background, with multiple training stages and also considering odometry anticipation due to camera motion.…”

Section: Related Work

mentioning

confidence: 99%

“…Note that to compute the loss, we use the output of MaskNet after the sigmoid activation and before the final thresholding in order to work with real values in [0, 1]. Since our model works with single instances, we do not rely on a dedicated network to generate semantic segmentations of the image as most state of the art methods [8][30] [18]. Instead, we simply group all object masks together maintaining their category label.…”

Section: Instance Mask Forecasting

mentioning

confidence: 99%

“…In this paper we focus on predicting future instance semantic segmentations of moving objects, since we believe it is the most informative representation for a machine planning motion. Most approaches are multimodal [8], integrating multiple modalities to achieve the highest accuracy. In general semantic segmentation [20], optical flow [30] and deep features [19] are considered as a source for future prediction.…”

Section: Introduction

mentioning

confidence: 99%

Forecasting Future Instance Segmentation with Learned Optical Flow and Warping

Ciamarra, Becattini, Seidenari, et al. 2022

Lecture Notes in Computer Science

For an autonomous vehicle it is essential to observe the ongoing dynamics of a scene and consequently predict imminent future scenarios to ensure safety to itself and others. This can be done using different sensors and modalities. In this paper we investigate the usage of optical flow for predicting future semantic segmentations. To do so we propose a model that forecasts flow fields autoregressively. Such predictions are then used to guide the inference of a learned warping function that moves instance segmentations on to future frames. Results on the Cityscapes dataset demonstrate the effectiveness of optical-flow methods.

“…Forecasting sequences in real-world settings, particularly from raw sensor measurements, is a complex problem due to the exponential timespace space dimensionality, the probabilistic nature of the future and the complex dynamics of the scene. Whilst much effort from the research community has been devoted to video forecasting [18], [44], [50], [64] and semantic forecasting [3], [24], [53], [57], depth and ego-motion forecasting have not received the same interest despite their importance. The geometry of the scene is essential for applications such as planning the trajectory of an agent.…”

mentioning

confidence: 99%

Forecasting of depth and ego-motion with transformers and self-supervision

Boulahbal, Voicila, Comport 2022

2022 26th International Conference on Pattern Recognition (ICPR)

This paper addresses the problem of end-to-end self-supervised forecasting of depth and ego motion. Given a sequence of raw images, the aim is to forecast both the geometry and ego-motion using a self supervised photometric loss. The architecture is designed using both convolution and transformer modules. This leverages the benefits of both modules: Inductive bias of CNN, and the multi-head attention of transformers, thus enabling a rich spatio-temporal representation that enables accurate depth forecasting. Prior work attempts to solve this problem using multi-modal input/output with supervised groundtruth data which is not practical since a large annotated dataset is required. Alternatively to prior methods, this paper forecasts depth and ego motion using only self-supervised raw images as input. The approach performs significantly well on the KITTI dataset benchmark with several performance criteria being even comparable to prior non-forecasting self-supervised monocular depth inference methods.

A survey on deep learning-based panoptic segmentation

Chen 2022

Digital Signal Processing
