Ideally, 360° imagery could inherit the deep convolutional neural networks (CNNs) already trained with great success on perspective projection images. However, existing methods for transferring CNNs from perspective to spherical images introduce significant computational costs and/or degradations in accuracy. We present the Kernel Transformer Network (KTN) to efficiently transfer convolution kernels from perspective images to the equirectangular projection of 360° images. Given a source CNN for perspective images as input, the KTN produces as output a function parameterized by a polar angle and a kernel. Given a novel 360° image, that function in turn can compute convolutions for arbitrary layers and kernels as the source CNN would on the corresponding tangent plane projections. Distinct from all existing methods, KTNs allow model transfer: the same model can be applied to different source CNNs with the same base architecture. This enables application to multiple recognition tasks without re-training the KTN. Validating our approach with multiple source CNNs and datasets, we show that KTNs improve the state of the art for spherical convolution. KTNs successfully preserve the source CNN's accuracy, while offering transferability, scalability to typical image resolutions, and, in many cases, a substantially lower memory footprint.

One existing approach applies the source CNN repeatedly to perspective projections of many tangent planes, which preserves accuracy but is computationally expensive. A second approach instead adapts the kernels to the distortion of the equirectangular projection; while this avoids repeated projections, it degrades accuracy, especially for deeper networks, due to an implicit interpolation assumption, as we will explain below. The third approach defines convolution in the spectral domain [11, 15], which has significant memory overhead and has thus far seen limited applicability to real-world data. All of the above require retraining to handle a new recognition task.

In light of these shortcomings, we propose the Kernel Transformer Network (KTN). The KTN adapts source CNNs trained on perspective images to 360° images. Instead of learning a new CNN on 360° images for a specific task, the KTN learns a function that takes a kernel in the source CNN as input and transforms it to be applicable to a 360° image in its equirectangular projection. See Fig. 1 (C). The function accounts for the distortion in 360° images, returning different transformations depending on both the polar angle θ and the source kernel. The model is trained to reproduce the outputs of the source CNN on the perspective projection of each tangent plane of an arbitrary 360° image; we sketch this objective and a minimal implementation at the end of this section. Hence, the KTN learns to behave like the source CNN while avoiding repeated projection of the image.

Key highlights of the proposed KTN are its transferability and compactness, both of which owe to our function-based design. Once trained for a base architecture, the same KTN can transfer multiple source CNNs to 360° images. For example, having trained a KTN for VGG [36] on ImageNet classification, we can transfer the same KTN to run a VGG-based PASCAL object detector on 360° panoramas. This is possible because the KTN takes the source CNN as input rather than embedding the CNN kernels in its own parameters (unlike [11, 12, 15, 37, 46]). Further...
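To make the training objective above concrete, one way to write it, in illustrative notation not defined elsewhere in this section, is the following: let f denote the KTN, K a source kernel, x an equirectangular input (or feature map), and P_{θ,φ}(x) its projection onto the tangent plane at angles (θ, φ). Training then minimizes

\[
\min_{f}\;\sum_{\theta,\phi}\Big\|\big(f(K,\theta) * x\big)(\theta,\phi)\;-\;\big(K * P_{\theta,\phi}(x)\big)(0,0)\Big\|^{2},
\]

where the right-hand term is the source CNN's response at the tangent point. Because equirectangular distortion varies only with the polar angle, the transformed kernel f(K, θ) is shared by all azimuths φ within a row.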
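Below is a minimal code sketch of the same idea, assuming PyTorch: a small network transforms a source kernel conditioned on θ, and the transformed kernel is convolved row by row over the equirectangular feature map. The names (KTNSketch, transform) and the two-layer MLP used for f are placeholder choices for illustration, not the architecture used in the paper.

```python
# Illustrative KTN sketch (assumes PyTorch); the MLP for f is a placeholder.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class KTNSketch(nn.Module):
    def __init__(self, in_ch, out_ch, src_k=3, dst_k=5):
        super().__init__()
        self.in_ch, self.out_ch, self.dst_k = in_ch, out_ch, dst_k
        # f: (flattened source kernel, theta) -> flattened transformed kernel
        self.f = nn.Sequential(
            nn.Linear(in_ch * src_k * src_k + 1, 256),
            nn.ReLU(),
            nn.Linear(256, in_ch * dst_k * dst_k),
        )

    def transform(self, src_kernel, theta):
        # src_kernel: (out_ch, in_ch, src_k, src_k); theta: polar angle (float)
        flat = src_kernel.flatten(1)                        # (out_ch, in*k*k)
        t = torch.full((flat.size(0), 1), theta, device=flat.device)
        out = self.f(torch.cat([flat, t], dim=1))
        return out.view(self.out_ch, self.in_ch, self.dst_k, self.dst_k)

    def forward(self, feat, src_kernel):
        # feat: (B, in_ch, H, W) equirectangular feature map. Each output row
        # is convolved with the kernel transformed for that row's polar angle.
        _, _, H, W = feat.shape
        pad = self.dst_k // 2
        x = F.pad(feat, (pad, pad, 0, 0), mode="circular")  # wrap in azimuth
        x = F.pad(x, (0, 0, pad, pad))                      # zeros at poles
        rows = []
        for i in range(H):
            theta = (i + 0.5) / H * math.pi                 # row's polar angle
            k = self.transform(src_kernel, theta)
            rows.append(F.conv2d(x[:, :, i:i + self.dst_k, :], k))
        return torch.cat(rows, dim=2)                       # (B, out_ch, H, W)
```

Since the transformed kernels depend only on the row's polar angle and on the source kernel, not on the input image, a real implementation could precompute them once per source CNN and cache them across images.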