Radar-based hand gesture recognition is an important research area that supports applications such as human-computer interaction and healthcare monitoring. Several deep learning algorithms for gesture recognition using Impulse Radio Ultra-Wide Band (IR-UWB) radar have been proposed. Most of them focus on achieving high performance, which requires a large amount of data. Acquiring and annotating such data remains a complex, costly, and time-consuming task. Moreover, processing a large volume of data usually requires a complex model with a large number of trainable parameters and high computation and memory costs. To overcome these shortcomings, we propose a simple data processing approach together with a lightweight multi-input hybrid model to enhance performance. We aim to improve on the existing state-of-the-art results obtained on an available IR-UWB gesture dataset consisting of range-time images of dynamic hand gestures. First, each image is augmented using the Sobel filter, which generates three low-level feature representations per sample: the gradient image in the x-direction, the gradient image in the y-direction, and the gradient image in both the x- and y-directions. Next, we use these representations as inputs to a three-input Convolutional Neural Network-Long Short-Term Memory-Support Vector Machine (CNN-LSTM-SVM) model. Each representation is fed to a separate CNN branch, and the branch outputs are concatenated for further processing by the LSTM. This combination allows richer spatiotemporal features of the target to be extracted automatically, without manual feature engineering or prior domain knowledge. To obtain a well-tuned classifier and a high recognition rate, we tune the SVM hyperparameters using the Optuna framework. Our proposed multi-input hybrid model achieved high performance on several metrics, including 98.27% accuracy, 98.30% precision, 98.29% recall, and a 98.27% F1-score, while maintaining low complexity.
Experimental results indicate that the proposed approach improves accuracy and prevents the model from overfitting.
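The Sobel-based preprocessing step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `correlate2d_same` and `sobel_representations`, the edge padding, and the 3x3 kernel size are assumptions made for illustration. In particular, the text does not state how the combined x- and y-direction image is formed, so the sketch assumes the gradient magnitude.

```python
import numpy as np

# Standard 3x3 Sobel kernels (kernel size is an assumption).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def correlate2d_same(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel, keeping the input shape.

    Borders are handled by replicating edge pixels (an assumption).
    """
    image = np.asarray(image, dtype=np.float64)
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out


def sobel_representations(image):
    """Return three gradient images for one range-time image.

    gx  : gradient in the x-direction (along columns)
    gy  : gradient in the y-direction (along rows)
    gxy : combined gradient, assumed here to be the magnitude sqrt(gx^2 + gy^2)
    """
    gx = correlate2d_same(image, SOBEL_X)
    gy = correlate2d_same(image, SOBEL_Y)
    gxy = np.hypot(gx, gy)
    return gx, gy, gxy


if __name__ == "__main__":
    # Synthetic stand-in for a range-time image: a horizontal intensity ramp.
    demo = np.tile(np.arange(5.0), (4, 1))
    gx, gy, gxy = sobel_representations(demo)
    print(gx.shape, gy.shape, gxy.shape)
```

In a pipeline like the one described, each of the three returned arrays would be passed to its own CNN branch. The shapes match the input, so the three branches can share the same input layer definition.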