Hand gestures are a form of nonverbal communication that can be used for many purposes, including communication with deaf and mute people, robotic manipulation, human-computer interaction (HCI), home automation, and healthcare. Most current research uses artificial intelligence approaches to extract dense features from hand gestures. Because most of these approaches rely on neural network models, their recognition accuracy depends heavily on how the hyperparameters are tuned. Our research therefore proposes a capsule neural network (CapsNet), which encapsulates the results of its internal computations on the inputs in a small vector of informative outputs. To further increase hand gesture recognition accuracy, the network is optimized by inserting additional SoftMax layers before the output layer of the CapsNet. The test results were then assessed and compared. The proposed approach performed favorably across all tests when compared with state-of-the-art systems.
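To make the two ingredients named above concrete, the following is a minimal sketch, not the architecture used in this work. It shows the standard capsule "squash" nonlinearity, which keeps a capsule vector's direction while mapping its length into [0, 1), followed by a SoftMax over the resulting capsule lengths. The three example capsule vectors, the function names, and the choice to apply SoftMax to capsule lengths are all assumptions made purely for illustration.

```python
import math


def squash(vector, eps=1e-9):
    """Capsule squashing: preserve direction, map length into [0, 1)."""
    sq_norm = sum(x * x for x in vector)
    norm = math.sqrt(sq_norm)
    # Scale factor ||v||^2 / (1 + ||v||^2) divided by ||v|| (eps avoids 0/0).
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in vector]


def softmax(scores):
    """Numerically stable SoftMax over a list of scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def capsule_lengths(capsules):
    """Length of each squashed capsule vector (one capsule per class)."""
    return [math.sqrt(sum(x * x for x in squash(c))) for c in capsules]


# Hypothetical output capsules for three gesture classes (illustrative values).
capsules = [[0.1, 0.2], [2.0, 1.5], [0.3, -0.1]]
lengths = capsule_lengths(capsules)

# Assumed placement: SoftMax applied to capsule lengths to get class scores.
probs = softmax(lengths)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Here the longest capsule vector (the second one) yields the largest SoftMax score, so `predicted` selects that class. In a trained network the capsule vectors would come from learned layers rather than hand-written values.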