Although Human Action Recognition (HAR) has been studied extensively and applied in many areas of computer vision, few studies focus on detecting the operations of workers in petrochemical plants. Production in these plants involves thousands of flammable chemicals and large amounts of toxic gas, which can lead to serious disasters. We therefore argue that real-time detection of workers' actions is needed to protect both worker safety and the production efficiency of petrochemical plants. However, most state-of-the-art HAR models require long training times and are so large that they are hard to deploy on edge devices. To address this problem, this paper proposes a YOLOv5-based architecture for detecting workers' operation of pipeline valves in petrochemical plants. We compute the relative positions of the objects detected by YOLOv5 and add an estimation layer after the YOLOv5 output layer, which allows the architecture to estimate workers' actions. Our experiments show that the architecture estimates workers' actions with 97.5% precision. For object detection, the precision, recall, and mAP@0.5 are each 98%, and the mAP@0.5:0.95 is 85%.
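As an illustration only, the idea of estimating an action from the relative positions of detected objects could be sketched as follows. The class labels `hand` and `valve`, the normalized center-distance measure, the threshold value, and all function names are assumptions made for this sketch; they are not the estimation layer described in this paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: a class label and a box in pixel coordinates."""
    label: str  # hypothetical class name, e.g. "hand" or "valve"
    x1: float
    y1: float
    x2: float
    y2: float


def center(d: Detection) -> tuple[float, float]:
    """Return the center point of a bounding box."""
    return ((d.x1 + d.x2) / 2, (d.y1 + d.y2) / 2)


def relative_distance(a: Detection, b: Detection) -> float:
    """Center-to-center distance of a and b, divided by the diagonal of b.

    Normalizing by the size of b makes the measure less sensitive to how
    far the camera is from the scene (an assumption of this sketch).
    """
    (ax, ay), (bx, by) = center(a), center(b)
    diag = math.hypot(b.x2 - b.x1, b.y2 - b.y1)
    if diag == 0:
        return math.inf  # degenerate box: treat as infinitely far away
    return math.hypot(ax - bx, ay - by) / diag


def estimate_action(detections: list[Detection], threshold: float = 1.0) -> str:
    """Rule-based estimate: a hand close to a valve counts as an operation.

    The threshold is a hypothetical value that would need tuning on data.
    """
    hands = [d for d in detections if d.label == "hand"]
    valves = [d for d in detections if d.label == "valve"]
    for h in hands:
        for v in valves:
            if relative_distance(h, v) <= threshold:
                return "operating_valve"
    return "no_operation"
```

In this sketch, the list of `Detection` objects stands in for the boxes produced by the object detector, and the rule in `estimate_action` plays the role of a very simple estimation step applied after detection.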