Deep learning models on edge devices have received considerable attention as a promising means of handling a variety of AI applications. However, deploying deep learning models in a production environment with efficient inference on edge devices remains challenging because of computation and memory constraints. This paper proposes a framework for a service robot named GuardBot, powered by a Jetson Xavier NX, and presents a real-world case study of deploying an optimized face mask recognition application with real-time inference on the edge device. The framework enables the robot to detect whether people are wearing a mask to guard against COVID-19 and to give a polite voice reminder to wear one. Our framework has a dual-stage architecture based on convolutional neural networks, with three main modules that employ (1) MTCNN for face detection; (2) our proposed CNN model and seven custom transfer-learning models (Inception-v3, VGG16, DenseNet121, ResNet50, NASNetMobile, Xception, and MobileNet-v2) for face mask classification; and (3) TensorRT to optimize all models and speed up inference on the Jetson Xavier NX. Our study analyzes the models' performance in terms of frames per second, execution time, and images per second. It also evaluates accuracy, precision, recall, and F1-score, and compares all models before and after optimization, with a main focus on high throughput and low latency. Finally, the framework is deployed on a mobile robot for experiments in both outdoor and multi-floor indoor environments, in patrolling and non-patrolling modes.
Compared with other state-of-the-art models, our proposed CNN model for classification-based face mask recognition obtains 94.5%, 95.9%, and 94.28% accuracy on the training, validation, and testing datasets, respectively, which is better than MobileNet-v2, Xception, and Inception-v3. After optimization, it also achieves the highest throughput and lowest latency of all the models at different precision levels.
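
As an illustration of the evaluation quantities named above, the following minimal Python sketch computes accuracy, precision, recall, and F1-score for a binary mask/no-mask classifier, and measures mean latency and images per second for an inference callable. It is not the benchmarking code used in this study: the function names `classification_metrics` and `benchmark`, the label convention (1 = mask, 0 = no mask), and the stand-in `infer` callable are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels.

    Assumed convention for this sketch: 1 = mask, 0 = no mask.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def benchmark(infer: Callable[[Sequence], object], batch: Sequence,
              runs: int = 50, warmup: int = 5) -> Dict[str, float]:
    """Mean latency per batch (ms) and throughput (images/s) for `infer`.

    `infer` is a hypothetical stand-in for a model's forward pass.
    """
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        infer(batch)

    start = time.perf_counter()
    for _ in range(runs):
        infer(batch)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)

    return {
        "latency_ms": 1000.0 * elapsed / runs,
        "images_per_second": runs * len(batch) / elapsed,
    }


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))

    # Dummy "model": labels every image in the batch as class 0.
    dummy_infer = lambda images: [0 for _ in images]
    print(benchmark(dummy_infer, batch=list(range(8))))
```

When timing a real model on a GPU-backed device, asynchronous execution typically has to be synchronized before the timer is read; that step depends on the inference runtime and is left out of this sketch.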