We present SILT, a Self-supervised Implicit Lighting Transfer method. Unlike previous research on scene relighting, we do not seek to apply arbitrary new lighting configurations to a given scene. Instead, we wish to transfer the lighting style from a database of other scenes, so that the output has a uniform lighting style regardless of the input. The solution operates as a two-branch network that first aims to map input images of any arbitrary lighting style to a unified domain, with extra guidance achieved through implicit image decomposition. We then remap this unified input domain using a discriminator that is presented with the generated outputs and the style reference, i.e., images of the desired illumination conditions. Our method is shown to outperform supervised relighting solutions across two different datasets without requiring lighting supervision. The code and pre-trained models can be found here.
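To make the two-branch setup concrete, the snippet below is a minimal, hypothetical PyTorch sketch of the adversarial training step implied above: a generator maps images under arbitrary lighting toward a unified domain, while a discriminator compares the generated outputs against style-reference images. The module definitions, the hinge-style adversarial loss, and the consistency term are illustrative assumptions, not the authors' exact architecture or objectives.

```python
# Minimal sketch of the adversarial lighting-transfer setup (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Maps an image under arbitrary lighting toward the unified lighting domain."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores whether an image matches the style-reference illumination."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x).mean(dim=(1, 2, 3))  # one score per image

def training_step(G, D, inputs, style_refs, opt_g, opt_d):
    """One step. `inputs`: images of the same scene under different unknown
    illuminations; `style_refs`: images with the desired target lighting."""
    # Discriminator update: real = style reference, fake = relit outputs.
    with torch.no_grad():
        fakes = G(inputs)
    d_loss = F.relu(1.0 - D(style_refs)).mean() + F.relu(1.0 + D(fakes)).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator and keep outputs of the same
    # scene consistent with each other (similar illumination across outputs).
    outs = G(inputs)
    g_adv = -D(outs).mean()
    g_cons = F.l1_loss(outs, outs.mean(dim=0, keepdim=True).detach().expand_as(outs))
    g_loss = g_adv + g_cons
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```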
Introduction

We propose the problem of lighting transfer, in which an input image under arbitrary illumination conditions is adapted to match the lighting of a style reference database. Previous approaches to a similar problem, arbitrary image relighting, either make simplifying assumptions about the scene (e.g. a single dominant object or a single light source) or require costly supervision in which identical scenes must be captured under a large number of known lighting conditions. In contrast, our lighting transfer approach, SILT, is entirely self-supervised. SILT requires only a training dataset with multiple examples of the same scene and a style reference database. It is not necessary to know the ground-truth lighting conditions of the training data, nor is it necessary for the target illumination to be present within the training dataset. This distinction between image relighting and lighting transfer is demonstrated in Fig. 1.

The proposed SILT method consists of a two-branch network. During training, the generator sees a number of input images of the same scene under different unknown illuminations. The model attempts to enforce similarity between the illumination conditions of the outputs