Human action recognition has a wide range of applications, including Ambient Intelligence systems and user assistance. Recognizing the actions a user performs enables better human–computer interaction and allows social robots to provide improved assistance in real-time scenarios. In this context, the performance of the prediction system is a key aspect. The purpose of this paper is to introduce a neural network approach, based on several types of convolutional layers, that achieves good action recognition performance while maintaining a high inference speed. The experimental results show that our solution, which combines graph convolutional networks (GCN) and temporal convolutional networks (TCN), is a suitable approach that meets this goal. In addition to the neural network model, we design a two-stage pipeline, consisting of data augmentation and data preprocessing, for obtaining relevant geometric features, which also contributes to improved performance.