The automation of surveillance systems, driven by rapid advances in computer vision, has significantly improved the analysis of surveillance video, particularly human activity recognition tasks such as behavior analysis and violence detection, thereby strengthening public and industrial security. Despite these advances, detecting and analyzing violent actions remains challenging, especially for real-time surveillance systems with limited computing power. We propose an artificial intelligence-based framework called VD-Net (Violence Detection Network), enabled by the Intelligent Internet of Things (IIoT), to detect violent behavior in public and private spaces. The model uses lightweight special task temporal convolutional network (ST-TCN) blocks and several bottleneck layers to focus on salient features in the input sequence. The learned features are then passed to a classifier that discriminates between violent and nonviolent actions. Additionally, the system is designed to trigger an alert when violence is detected, which is then communicated to the relevant departments. We evaluated the proposed system on surveillance and non-surveillance datasets and achieved a 1-4% improvement in accuracy over the state of the art (SoTA).
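
To make the pipeline described above concrete, the following is a minimal sketch of a temporal convolution block followed by a classifier and an alert decision. It is not the VD-Net implementation: the causal dilated convolution, the residual connection, the average pooling, the logistic classifier, the 0.5 threshold, and all function names and weights are assumptions chosen for illustration, and the bottleneck layers are omitted for brevity.

```python
import math


def causal_dilated_conv1d(seq, kernel, dilation=1):
    """Causal 1-D convolution over a sequence of per-frame scalar features.

    out[t] = sum_k kernel[k] * seq[t - k * dilation], treating indices
    before the start of the sequence as zero (assumed padding scheme).
    """
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out


def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]


def temporal_block(seq, kernel, dilation=1):
    """Hypothetical temporal block: convolution, ReLU, then a residual add."""
    conv = relu(causal_dilated_conv1d(seq, kernel, dilation))
    return [s + c for s, c in zip(seq, conv)]


def classify(seq, weight, bias, threshold=0.5):
    """Average-pool over time, then apply a logistic unit.

    Returns (probability of violence, alert flag). The weight, bias, and
    threshold are illustrative values, not learned parameters.
    """
    pooled = sum(seq) / len(seq)
    prob = 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))
    return prob, prob >= threshold


if __name__ == "__main__":
    # Made-up per-frame feature values standing in for extracted features.
    features = [0.1, 0.2, 0.9, 1.0]
    hidden = temporal_block(features, kernel=[0.5, 0.5], dilation=1)
    hidden = temporal_block(hidden, kernel=[0.5, 0.5], dilation=2)
    prob, alert = classify(hidden, weight=1.0, bias=-1.0)
    print(f"violence probability: {prob:.3f}, alert: {alert}")
```

Stacking blocks with increasing dilation widens the temporal context each output frame can see without adding parameters, which is one common reason temporal convolutions are attractive on devices with limited computing power.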