“…Deep generative models, e.g., variational autoencoders (VAEs) [38], are effective tools to model multi-modal data distributions. Most existing work [68,46,6,61,42,71,3] using deep generative models for human motion prediction is focused on the design of the generative model to allow it to effectively learn the data distribution. After the generative model is learned, little attention has been paid to the sampling method used to produce motion samples (predicted future motions) from the pretrained generative model (weights kept fixed).…”