This paper studies human activity recognition (HAR) with Siamese neural networks (SNNs), comparing the effectiveness of contrastive and triplet learning. As HAR grows in importance in healthcare, sports, and smart environments, so does the need for models that accurately recognize and classify complex human activities. To address this need, we introduce a Siamese network architecture that combines convolutional neural networks (CNNs) for spatial feature extraction, a bidirectional LSTM (Bi-LSTM) for capturing temporal dependencies, and attention mechanisms that prioritize salient features. We train the network with both contrastive and triplet loss functions and analyze how each affects its ability to generate discriminative embeddings for HAR tasks. Extensive experiments show that Siamese networks, particularly those trained with triplet loss, achieve higher activity recognition accuracy and F1 scores than baseline deep learning models. Adding a stacking meta-classifier further improves classification performance, demonstrating the robustness and adaptability of the proposed model. These findings highlight the potential of Siamese networks trained with contrastive and triplet learning for improving HAR systems and motivate future work on model optimization and broader applications.
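
For readers unfamiliar with the two objectives compared here, the following is a minimal NumPy sketch of commonly used forms of the contrastive and triplet losses. It is an illustration only, not the paper's implementation: the function names, the margin value of 1.0, the use of unsquared Euclidean distance in the triplet loss, and the omission of a 1/2 scaling factor in the contrastive loss are all assumptions made for this example.

```python
import numpy as np

def contrastive_loss(z1, z2, same, margin=1.0):
    """Mean contrastive loss over a batch of embedding pairs.

    z1, z2: arrays of shape (batch, dim) holding the two branch embeddings.
    same:   array of shape (batch,), 1.0 if the pair shares an activity
            label and 0.0 otherwise.
    margin: hypothetical default; dissimilar pairs farther apart than
            this contribute no loss.
    """
    d = np.linalg.norm(z1 - z2, axis=1)            # Euclidean distance per pair
    pull = same * d ** 2                           # similar pairs: pull together
    push = (1.0 - same) * np.maximum(margin - d, 0.0) ** 2  # dissimilar: push apart
    return float(np.mean(pull + push))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean triplet loss over a batch of (anchor, positive, negative) embeddings.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    d_ap = np.linalg.norm(anchor - positive, axis=1)  # anchor-positive distance
    d_an = np.linalg.norm(anchor - negative, axis=1)  # anchor-negative distance
    return float(np.mean(np.maximum(d_ap - d_an + margin, 0.0)))
```

The contrastive loss scores each pair independently against a fixed margin, whereas the triplet loss only constrains the relative ordering of distances around an anchor.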