2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00587

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos

Cited by 265 publications (254 citation statements)
References 23 publications
“…The standard formulation of trajectory prediction problem in the literature [10], [12] is in the 2D image space. With observed trajectories of all moving agents in the scene, including persons and vehicles, the task is to predict the moving trajectories of all agents for the next period of time, say 10 seconds, in the near future.…”
Section: A Methods Overview
Citation type: mentioning
confidence: 99%
“…Sadeghian et al [11] incorporated scene context to human trajectory prediction based on GAN (Generative Adversarial Network). Reference [12] extracted multiple visual features, including each person's body keypoints and the scene semantic map to predict human behavior and model interaction with the surrounding environment. Reference [3] proposed a Bayesian framework to predict unobserved paths from previously observed motions and to transfer learned motion patterns to new scenes.…”
Section: A Human Trajectory Prediction
Citation type: mentioning
confidence: 99%
“…Datasets like Human 3.6M, Moving MNIST and RobotPush were used for experimentation [9]. For capturing the spatial and temporal features in natural video's, proposed an Encoder-Decoder CNN and convolutional LSTM model [10]; Proposed a method Deep Voxel flow that consists of a Fully-Convolutional Encoder-Decoder architecture containing with three convolutional and deconvolutional layers [12]; for predicting the video frame used Gated Recurrent units which confluence noise during training phase [13]; This paper focuses on pixel level video prediction that uses hierarchical approach [14]; Predicts the Pedestrian's behavior, by extracting the person appearance and pose features [16]; Proposed an SDC module for learning motion vector of objects in the frame prediction and a kernel for synthesizing the pixels [17,31]; Proposed a Meta-Learning an unsupervised learning rule for training networks [18]; Focused on action prediction sequences, for this proposed global-local temporal action prediction model [24,23,33]. Intuitively, all the last few predicted frames from different models suffer from blurriness.…”
Section: Literature Survey
Citation type: mentioning
confidence: 99%
“…Conventional methods focus on using egocentric features, e.g., previous trajectory, for trajectory prediction, which lacks considering some surrounding information, i.e., multi-agents, in traffic environments. For example, some works in trajectory prediction mainly focus on road-agents in homogeneous environments [2,4,10,11,21,30,31,34,39], which only consists of a single type of road-agent in a scene. However, in real driving environment, it is necessary to differentiate the interactions among different types of road-agents such as the difference between pedestrians and bikes or bikes and trucks.…”
Section: Introduction
Citation type: mentioning
confidence: 99%