Consumers often pay attention to how crowded a consumption place is, yet crowd detection, distance estimation between people, and real-time monitoring and calculation of the human flow remain unsolved problems in this setting. This paper designs a real-time crowd detection scheme for such application scenarios. The scheme uses the YOLO algorithm, with Darknet53 as the backbone network, to separate pedestrians from the background. The two-dimensional center coordinates of each pedestrian are then converted into three-dimensional coordinates, which enables crowd detection and distance estimation between people, as well as real-time monitoring of the traffic and flow density in the current region. The scheme can be applied in many areas, such as shop rating, traffic control, and flow control at scenic spots. Existing monitoring systems are affected by varying lighting conditions and cannot provide accurate data. To reduce this interference, the scheme preprocesses the input before estimating the human flow and the positions of people, and its processing algorithm is stable and accurate. Experimental verification shows that the scheme achieves real-time monitoring and calculation.
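The conversion from two-dimensional centers to three-dimensional coordinates and the resulting distance and density estimates can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) and a known depth for each pedestrian, neither of which is specified in the abstract, and the function names `backproject`, `pairwise_distances`, and `crowd_density` are hypothetical.

```python
import numpy as np


def backproject(centers_px, depths_m, fx, fy, cx, cy):
    """Map pixel centers (u, v) with known depth to camera-frame 3D points.

    Assumes a pinhole camera model; the intrinsics and per-person depths
    are illustrative inputs, not values taken from the paper.
    """
    centers_px = np.asarray(centers_px, dtype=float)
    depths_m = np.asarray(depths_m, dtype=float)
    x = (centers_px[:, 0] - cx) * depths_m / fx
    y = (centers_px[:, 1] - cy) * depths_m / fy
    return np.stack([x, y, depths_m], axis=1)


def pairwise_distances(points):
    """Euclidean distance between every pair of 3D points (N x N matrix)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def crowd_density(num_people, area_m2):
    """People per square meter for a monitored region of known area."""
    return num_people / area_m2


# Two detected pedestrians, both assumed to stand 5 m from the camera.
centers = [(320, 240), (420, 240)]
points = backproject(centers, [5.0, 5.0], fx=500.0, fy=500.0, cx=320.0, cy=240.0)
distances = pairwise_distances(points)
density = crowd_density(len(points), area_m2=20.0)
```

In a full pipeline the pixel centers would come from the bounding boxes produced by the detector, and the depth would have to be supplied by a calibration step or a depth sensor.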