In the era of pervasive smart devices, natural language understanding (NLU) plays a pivotal role in enabling intelligent interaction and decision-making. Slot filling and intent recognition are core NLU tasks, both essential for comprehending user input. While joint modelling of these tasks has gained prominence, realizing efficient joint models on resource-constrained devices remains a significant challenge. Such devices have limited computational capacity and must meet real-time requirements, which calls for lightweight and efficient models. In this study, we explore the design of a resource-efficient joint model for slot filling and intent recognition. By leveraging BERT, graph neural networks, and mask mechanisms, our model performs both semantic slot prediction and intent classification. We focus on model design, training, and real-time inference, aiming to contribute to natural language understanding under resource constraints. Our investigation demonstrates the efficacy of the approach even with a reduced dataset, underscoring the model's applicability to real-world scenarios with limited resources.