Musculoskeletal pain is a significant health issue among personnel in the Information Technology (IT) industry and among health-care professionals. Work in the current IT sector requires people to sit in one place for long periods (~3-4 hours). This causes severe hip, neck, and shoulder pain and may lead to paralysis. In this work, a three-dimensional (3D) image is converged into a plane-based projection to precisely classify posture images of trunk extension and flexion and wrist extension and flexion exercises. Because postures in different planes can be detected incorrectly during the convergence process, a deep learning algorithm is used to improve recognition accuracy and processing speed. A 200-image dataset of wrist and trunk extension and flexion exercise postures is created at various planes. The performance of the proposed deep learning algorithm is compared with that of a CNN with accelerometer sensing image data, a DNN with RGB images, a CNN-GRU with Kinect depth images, a Deep Hybrid CNN with body-portion keyframe images, and Spatial Transformer Networks (STN) with an attention-based multi-scale CNN with Grad-CAM images. The results demonstrate the efficiency of such systems in musculoskeletal rehabilitation therapy: the proposed deep learning-based system successfully identifies the completion of rehabilitation activities with a training accuracy of 98.12% and a validation accuracy of 95%. Our approach can track and enhance the efficiency of patients' rehabilitation training with higher precision than several state-of-the-art CNN-based baseline architectures.
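
To make the idea of a plane-based projection concrete, the following is a minimal sketch, not the method used in this work. It assumes the 3D posture is available as a list of (x, y, z) points with x running left-right, y up-down, and z front-back, and it uses a simple orthographic projection onto an anatomical plane by discarding one axis. The names `PLANES` and `project_to_plane` are hypothetical and introduced only for illustration.

```python
# Illustrative assumption: each anatomical plane is identified by the
# coordinate axis that is discarded when projecting onto it.
# Assumed axes: index 0 = x (left-right), 1 = y (up-down), 2 = z (front-back).
PLANES = {
    "sagittal": 0,    # side view: drop the left-right axis
    "frontal": 2,     # front view: drop the front-back axis
    "transverse": 1,  # top view: drop the up-down axis
}


def project_to_plane(points, plane):
    """Orthographically project 3D points onto the named plane.

    points: iterable of (x, y, z) tuples.
    plane:  one of the keys of PLANES.
    Returns a list of 2D tuples with the discarded axis removed.
    """
    drop = PLANES[plane]
    return [tuple(c for i, c in enumerate(p) if i != drop) for p in points]


# Hypothetical example points (not data from this work).
joints = [(0.1, 1.5, 0.3), (0.2, 1.0, 0.4)]
side_view = project_to_plane(joints, "sagittal")
```

Under these assumptions, each projected view could then be rendered as an image and passed to a classifier; how the projection and classification are actually performed in the proposed system is not specified here.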