This paper considers the task of appearance-based localization: visual place recognition from omnidirectional images obtained with catadioptric cameras. The focus is on designing an efficient neural network architecture that accurately and reliably recognizes indoor scenes in the distorted images from a catadioptric camera, even in self-similar environments with few discernible features. As the target application is the global localization of a low-cost mobile service robot, the proposed solutions are optimized to be small-footprint models that provide real-time inference on edge devices, such as the Nvidia Jetson. We compare several design choices for the neural network architecture of the localization system and demonstrate that the best results are achieved with embeddings (global descriptors) obtained through transfer learning and fine-tuning on a limited number of catadioptric images. We test our solutions on two small-scale datasets collected with different catadioptric cameras in the same office building. Next, we compare the performance of our system with state-of-the-art visual place recognition systems on the publicly available COLD Freiburg and Saarbrücken datasets, which contain images collected under different lighting conditions. Our system compares favourably to the competitors in terms of both place recognition accuracy and inference time, providing a cost- and energy-efficient means of appearance-based localization for an indoor service robot.
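To make the described pipeline concrete, the sketch below illustrates the general pattern of embedding-based place recognition: a pretrained backbone is repurposed as a global-descriptor extractor, and a query image is localized by nearest-neighbour search over descriptors of reference images. The choice of ResNet-18, the input resolution, and cosine-similarity matching are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of embedding-based place recognition, assuming an
# ImageNet-pretrained ResNet-18 backbone (an assumption; the paper
# compares several architectures and fine-tunes on catadioptric images).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Replace the final classifier with identity so the network outputs
# a 512-dimensional global descriptor (embedding) instead of class logits.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),  # illustrative input resolution
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(image: Image.Image) -> torch.Tensor:
    """Return an L2-normalized global descriptor for one image."""
    x = preprocess(image).unsqueeze(0)  # (1, 3, 224, 224)
    d = backbone(x).squeeze(0)          # (512,)
    return d / d.norm()

def recognize(query: Image.Image, db: torch.Tensor, labels: list[str]) -> str:
    """Match a query against a database of reference descriptors
    (one unit-norm row per reference image) by cosine similarity."""
    scores = db @ embed(query)  # dot product of unit vectors = cosine similarity
    return labels[int(scores.argmax())]
```

In this pattern, fine-tuning the backbone on a limited set of catadioptric images, as the paper reports, adapts the descriptors to the heavy radial distortion of the omnidirectional imagery while keeping the model footprint small enough for real-time inference on an edge device.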