In recent years, the use of artificial intelligence (AI) for image- and video-based crime detection has gained significant attention from law enforcement agencies and security experts. Deep learning (DL) models can learn complex patterns from data and help law enforcement agencies save time and resources by automatically identifying and tracking potential criminals. This supports more thorough investigations and helps narrow the search for suspects. Handheld firearms and bladed weapons are among the objects most frequently encountered at crime scenes. In this paper, we propose a DL-based surveillance system that detects the presence of tracked objects, such as handheld firearms and bladed weapons, and can alert authorities to potential threats before an incident occurs. We compare several DL-based object detection techniques, namely you only look once (YOLO), single shot multibox detector (SSD), and faster region-based convolutional neural network (Faster R-CNN), and find that YOLO achieves the best balance between mean average precision (mAP) and inference speed for real-time prediction. We therefore adopt YOLOv5 for the implementation of our solution.
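As an illustration of the detect-then-alert behaviour described above, the following is a minimal sketch of how per-frame detector output might be turned into an alert decision. It is not the implementation used in this work: the `Detection` type, the label names in `WATCHED_LABELS`, the confidence threshold, and the requirement of several consecutive frames are all assumptions introduced for the example, and the detector itself (e.g. YOLOv5) is assumed to run upstream and supply the detections.

```python
from dataclasses import dataclass

# Hypothetical label set and threshold; neither is specified in the text.
WATCHED_LABELS = {"handgun", "knife"}
CONFIDENCE_THRESHOLD = 0.5


@dataclass(frozen=True)
class Detection:
    """One detector output for a single frame (hypothetical structure)."""
    label: str
    confidence: float
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def threats_in_frame(detections, threshold=CONFIDENCE_THRESHOLD):
    """Return detections of watched classes whose confidence meets the threshold."""
    return [
        d for d in detections
        if d.label in WATCHED_LABELS and d.confidence >= threshold
    ]


def should_alert(frames, min_consecutive=3, threshold=CONFIDENCE_THRESHOLD):
    """Return True if a watched object appears in min_consecutive frames in a row.

    Requiring a short streak is an assumed heuristic to suppress
    single-frame false positives; it is not taken from the text.
    """
    streak = 0
    for detections in frames:
        if threats_in_frame(detections, threshold):
            streak += 1
        else:
            streak = 0
        if streak >= min_consecutive:
            return True
    return False
```

In such a design, the streak length trades alert latency against false alarms: a larger `min_consecutive` delays the alert by a few frames but is less sensitive to spurious single-frame detections.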