This study examines the impact of sensor placement and multimodal sensor fusion on the performance of a Long Short-Term Memory (LSTM)-based model for human activity classification taking place in an agricultural harvesting scenario involving human-robot collaboration. Data were collected from twenty participants performing six distinct activities using five wearable inertial measurement units placed at various anatomical locations. The signals collected from the sensors were first processed to eliminate noise and then input into an LSTM neural network for recognizing features in sequential time-dependent data. Results indicated that the chest-mounted sensor provided the highest F1-score of 0.939, representing superior performance over other placements and combinations of them. Moreover, the magnetometer surpassed the accelerometer and gyroscope, highlighting its superior ability to capture crucial orientation and motion data related to the investigated activities. However, multimodal fusion of accelerometer, gyroscope, and magnetometer data showed the benefit of integrating data from different sensor types to improve classification accuracy. The study emphasizes the effectiveness of strategic sensor placement and fusion in optimizing human activity recognition, thus minimizing data requirements and computational expenses, and resulting in a cost-optimal system configuration. Overall, this research contributes to the development of more intelligent, safe, cost-effective adaptive synergistic systems that can be integrated into a variety of applications.