“…As a result, humans are extremely sensitive to movement and rendering artefacts, which gives rise to the well-known uncanny valley in photo-realistic rendering of human appearance. Recently there has been significant progress using deep generative models to synthesise highly realistic images (Goodfellow et al, 2014; Kingma and Welling, 2013; Zhu et al, 2017; Isola et al, 2016; Ulyanov et al, 2016; Ma et al, 2017; Siarohin et al, 2017; Paier et al, 2020) and videos (Vondrick et al, 2016; Tulyakov et al, 2017) of scenes, which is important for applications such as image manipulation, video animation and rendering of virtual environments. Human avatars are typically rendered using detailed, explicit 3D models, which consist of meshes and textures, and animated using tailored motion models to simulate human behaviour and activity.…”