Human–robot teaming (HrT) is being adopted in an increasing range of industries and work environments. Effective HrT relies on successful, complex, and dynamic human–robot interaction. Although it may be ideal for robots to possess the full range of social and emotional skills needed to function as productive team members, certain cognitive capabilities alone can enable them to develop attitude-based competencies that optimize team performance. Despite extensive research on human–human team structures, research on HrT remains relatively limited. In this context, incorporating established human–human teaming (HhT) elements may prove practical. One key element is mutual performance monitoring (MPM): the reciprocal observation and active anticipation of team members’ actions within the team setting, which fosters enhanced team coordination and communication. Adopting this concept, this study uses machine learning (ML)-based visual action recognition as a potential tool for monitoring the human component in HrT. The study applies a data modeling approach to an existing dataset, the “Industrial Human Action Recognition Dataset” (InHARD), curated specifically for human action recognition in industrial assembly tasks involving human–robot collaboration. This paper presents the results of this modeling approach, analyzing the dataset to implement a theoretical concept that can serve as a first step toward enabling such systems to adapt dynamically. The outcomes underscore the significance of implementing state-of-the-art teaming concepts by integrating modern technologies and assessing the feasibility of advancing HrT in this direction.