This work tackles the problem of temporally coherent face anonymization in natural video streams. We propose JaGAN, a two-stage system. The first stage detects faces and masks them out with black image patches in every frame of the video. The second stage leverages a privacy-preserving video Generative Adversarial Network designed to inpaint the masked patches with artificially generated faces. Our initial experiments reveal that image-based generative models are not capable of inpainting patches whose appearance is temporally coherent across neighboring video frames. To address this issue, we introduce a newly curated video collection, which is made publicly available to the research community along with this paper. We also introduce the Identity Invariance Score (IdI) as a means to quantify temporal coherence between neighboring frames.
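The masking step of the first stage can be illustrated with a minimal sketch. It is not the JaGAN implementation: the function names `mask_faces` and `mask_video`, the `(x, y, w, h)` box format, and the assumption that boxes come from some external face detector are all hypothetical choices made for illustration.

```python
import numpy as np


def mask_faces(frame, boxes):
    """Return a copy of `frame` with every face box filled with black pixels.

    `frame` is an H x W x C image array. `boxes` is an iterable of
    (x, y, w, h) tuples in pixel coordinates; this format is an assumption
    and would be supplied by whatever face detector is used.
    """
    masked = frame.copy()
    height, width = frame.shape[:2]
    for x, y, w, h in boxes:
        # Clip the box to the frame so partially visible faces are handled.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        if x1 > x0 and y1 > y0:
            masked[y0:y1, x0:x1] = 0
    return masked


def mask_video(frames, boxes_per_frame):
    """Mask each frame independently, as the first stage is described."""
    return [mask_faces(f, b) for f, b in zip(frames, boxes_per_frame)]
```

The masked frames produced this way would then serve as input to the inpainting stage. Because each frame is masked independently, any temporal coherence has to come from the generative model rather than from this step.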