Observing human actions can affect several features of visual motion processing, as well as the motor responses of the observer. Here, we tested the hypothesis that action observation helps decode environmental forces during the interception of a decelerating target within a brief time window, an intrinsically difficult task. We employed a factorial design to evaluate the effects of scene orientation (normal or inverted) and target gravity (normal or inverted). A button press triggered the motion of a bullet, a piston, or a human arm. In the Bullet group, timing errors were smaller for upright scenes irrespective of gravity direction, whereas in the Piston group errors were smaller for the standard condition of normal scene and normal gravity. In the Arm group, by contrast, performance was better when the directions of scene and target gravity were concordant, irrespective of whether both were upright or inverted. These results suggest that the default viewer-centered reference frame is used with inanimate scenes, such as those of the Bullet and Piston protocols. In animate scenes (as in the Arm protocol), by contrast, the presence of biological movements may help process target kinematics under the ecological conditions of coherence between scene and target gravity directions.