With the development of the economy and society, the demand for social security and stability increases. However, traditional security systems rely too much on human resources and are affected by uncontrollable community security factors. An intelligent security monitoring system can overcome the limitations of traditional systems and save human resources, contributing to public security. To build this system, a RISC-V SoC is first designed in this paper and implemented on the Nexys-Video Artix-7 FPGA. Then, the Linux operating system is transplanted and successfully run. Meanwhile, the driver of related hardware devices is designed independently. After that, three OpenCV-based object detection models including YOLO (You Only Look Once), Haar (Haar-like features), and LBP (Local Binary Pattern) are compared, and the LBP model is chosen to design applications. Finally, the processing speed of 1.25 s per frame is realized to detect and track moving objects. To sum up, we build an intelligent security monitoring system with real-time detection, tracking, and identification functions through hardware and software collaborative design. This paper also proposes a video downsampling technique. Based on this technique, the BRAM resource usage on the hardware side is reduced by 50% and the amount of pixel data that needs to be processed on the software side is reduced by 75%. A video downsampling technology is also proposed in this paper to achieve better video display effects under limited hardware resources. It provides conditions for future function expansion and improves the models’ processing speed. Additionally, it reduces the run time of the application and improves the system performance.