Harvesting in soft-fruit farms is labor intensive and time consuming, and it is severely affected by the scarcity of skilled labor. During soft-fruit harvesting, human pickers spend 20–30% of the overall operation time on logistics activities. Such unproductive time can be reduced, for example, by optimally deploying a fleet of agricultural robots and scheduling them by anticipating the human activity behaviour (state) during harvesting. In this paper, we propose a framework for spatio-temporal prediction of human pickers’ activities while they are picking fruit in agricultural fields. We exploit temporal patterns of the picking operation together with 2D discrete points, called topological nodes, which act as spatial constraints imposed by the agricultural environment. Both kinds of information are used in the prediction framework, in combination with a variant of the Hidden Markov Model (HMM) algorithm, to create two modules. The proposed methodology is validated with two test cases. In Test Case 1, the first module selects an optimal temporal model, called the picking_state_progression model, which uses temporal features of a picker state (event) to statistically evaluate an adequate number of intra-states, also called sub-states. In Test Case 2, the second module uses the outcome of the optimal temporal model in the subsequent spatial model, called the node_transition model, and performs “spatio-temporal predictions” of the picker’s movement while the picker is in a particular state. The Discrete Event Simulation (DES) framework, a proven agricultural multi-robot logistics model, is used to simulate different picking operation scenarios with and without our proposed prediction framework, and the results are statistically compared. Our prediction framework can reduce the so-called unproductive logistics time of a fully manual harvesting process by about 80 percent in the overall picking operation.
This research also indicates that different picking rates involve different numbers of sub-states, and that these sub-states are associated with different trends in the spatio-temporal predictions.
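To make the idea of predicting movement over topological nodes concrete, the following is a minimal sketch that propagates a probability distribution over nodes with a plain first-order Markov chain. It is not the HMM variant used in the framework; the node names (`n0`, `n1`, `n2`), the transition probabilities, and the function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a first-order Markov chain over topological nodes.
# Node names and probabilities are made-up assumptions, not values from the paper.

def predict_node_distribution(transitions, start, steps):
    """Return the probability of being at each node after `steps` time steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = {}
        for node, p in dist.items():
            # Spread the probability mass of `node` over its successor nodes.
            for target, q in transitions[node].items():
                nxt[target] = nxt.get(target, 0.0) + p * q
        dist = nxt
    return dist


def most_likely_node(transitions, start, steps):
    """Return the node with the highest predicted probability."""
    dist = predict_node_distribution(transitions, start, steps)
    return max(dist, key=dist.get)


# Hypothetical row with three nodes: at each step the picker either
# stays at the current node or advances to the next one.
TRANSITIONS = {
    "n0": {"n0": 0.5, "n1": 0.5},
    "n1": {"n1": 0.5, "n2": 0.5},
    "n2": {"n2": 1.0},
}

if __name__ == "__main__":
    print(predict_node_distribution(TRANSITIONS, "n0", 2))
    print(most_likely_node(TRANSITIONS, "n0", 2))
```

In a setting like the one described above, the transition probabilities would presumably depend on the picker's current state and sub-state rather than being fixed constants as in this sketch.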