Deep learning has driven major advances in image processing, allowing automotive devices to be used more widely in daily life than ever before. Mobile robot navigation is currently among the most active topics to which researchers are applying deep learning methods. In this paper, we present a system that allows a mobile robot to localize itself and navigate autonomously in the accessible areas of an indoor environment. The proposed system exploits the ability of the Convolutional Neural Network (CNN) model to extract feature maps for image classification and visual localization, and it integrates the CNN model with a topological map of the robot workspace to precisely determine the region in which the robot is located. A small dataset of images is acquired with the MYNT EYE camera. Furthermore, we introduce a new loss function to address the limited generalization capability of the CNN model on small datasets. The proposed loss function considers not only the probability that the input is assigned to its true class but also the probability that it is assigned to classes other than its true class. We evaluate the proposed system through empirical studies on the provided datasets. The results show that the proposed system outperforms other state-of-the-art techniques in terms of accuracy and generalization capability.
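To make the idea behind the loss function concrete, the following is a minimal sketch of one way a loss could combine a true-class term with a term over the remaining classes. The abstract does not give the formula, so everything here is an assumption made for illustration: the function names, the `other_weight` parameter, and the specific form (standard cross-entropy plus an averaged penalty on the probability mass assigned to wrong classes) are hypothetical and should not be read as the proposed loss itself.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def true_and_other_class_loss(logits, true_class, other_weight=1.0, eps=1e-12):
    """Illustrative loss with a true-class term and an other-class term.

    ASSUMPTION: this form is a stand-in, not the formula from the paper.
      true_term  = -log p_y                      (ordinary cross-entropy)
      other_term = mean over j != y of -log(1 - p_j)
    Requires at least two classes.
    """
    probs = softmax(logits)
    num_classes = len(probs)
    # Reward high probability on the true class.
    true_term = -math.log(probs[true_class] + eps)
    # Penalize probability assigned to every class other than the true one.
    other_term = -sum(
        math.log(1.0 - p + eps)
        for j, p in enumerate(probs)
        if j != true_class
    ) / (num_classes - 1)
    return true_term + other_weight * other_term


# Example: three location regions, scores strongly favoring region 0.
scores = [5.0, 0.0, 0.0]
loss_if_region_0_is_true = true_and_other_class_loss(scores, true_class=0)
loss_if_region_1_is_true = true_and_other_class_loss(scores, true_class=1)
```

In this sketch, setting `other_weight=0.0` recovers plain cross-entropy, so the second term can be viewed as an additional signal drawn from the wrong-class probabilities, which is the kind of extra supervision that may help when training images are scarce.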