Human activity recognition (HAR) is a wide research topic in a field of computer science. Improving HAR can lead to massive breakthrough in humanoid robotics, robots used in medicine and in the field of autonomous vehicles. The system that is able to recognise human and its activity without any errors and anomalies would lead to safer and more empathetic autonomous systems. During this research work, multiple neural networks models, with different complexity, are being investigated. Each model is retrained on the proposed unique data set, gathered on automated guided vehicle (AGV) with the latest and the modest sensors used commonly on autonomous vehicles. The best model is picked out based on the final accuracy for action recognition. Best models pipeline is fused with YOLOv3, to enhance the human detection. In addition to pipeline improvement, multiple action direction estimation methods are proposed.