We develop a data-driven, statistical control method for autonomous excavators. Interactions between soil and an excavator bucket are highly complex and nonlinear, making traditional physical models difficult to use for real-time control. We instead exploit data obtained from laboratory tests to construct a nonlinear, non-parametric statistical model that predicts the behavior of soil scooped by an excavator bucket. The prediction model is built for controlling the amount of soil collected with a bucket. An excavator collects soil by dragging the bucket along the soil surface and then scooping the soil by rotating the bucket. The switch from the drag phase to the scoop phase must be timed correctly so that an appropriate amount of soil has accumulated in front of the bucket. We model the process as a heteroscedastic Gaussian process, based on the observation that the variance of the collected soil mass depends on the scooping trajectory, i.e., the input, as well as on the shape of the soil surface immediately prior to scooping. We develop an optimal control algorithm that switches from the drag phase to the scoop phase at an appropriate time and generates a scoop trajectory that captures a desired amount of soil with high confidence. We implement the method on a robotic excavator, and experiments show promising results.

Index Terms: Data-driven control, Gaussian process, robotics in construction, mining robotics, field robots, model learning for control, optimization and optimal control, probability and statistical methods
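
To make the idea of an input-dependent variance concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical scalar input `u` (for example, a normalized scoop depth), synthetic training data, fixed kernel hyperparameters, and a simple two-stage approximation in which a second Gaussian process smooths the log squared residuals of a first fit. It then picks the candidate input that maximizes the predicted probability that the collected mass falls within a tolerance band around a target. All function names, constants, and data here are illustrative assumptions.

```python
import math
import numpy as np

LOG_CHI2_BIAS = 1.2704            # -E[log chi^2_1]; corrects the log-residual bias
LOG_CHI2_VAR = math.pi ** 2 / 2   # Var[log chi^2_1]; noise level of log residuals

def rbf(a, b, length, var):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_mean_var(x, y, noise_var, xs, length=0.5, var=4.0):
    """Zero-mean GP posterior mean and latent variance at xs.

    noise_var holds one noise variance per training point (assumed known here).
    """
    K = rbf(x, x, length, var) + np.diag(noise_var)
    Ks = rbf(xs, x, length, var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks.T)
    return Ks @ alpha, np.maximum(var - np.sum(v * v, axis=0), 0.0)

def fit_predict(x, y, xs):
    """Two-stage heteroscedastic fit; returns predictive mean and std at xs."""
    n, y0 = len(x), y.mean()
    # Stage 1: homoscedastic fit (assumed noise variance 0.1) to get residuals.
    m1, _ = gp_mean_var(x, y - y0, np.full(n, 0.1), x)
    z = np.log((y - y0 - m1) ** 2 + 1e-6)
    z0 = z.mean()

    def noise_at(q):
        # Stage 2: smooth the log squared residuals into a log noise variance.
        g, _ = gp_mean_var(x, z - z0, np.full(n, LOG_CHI2_VAR), q)
        return np.maximum(np.exp(g + z0 + LOG_CHI2_BIAS), 1e-4)

    # Stage 3: refit the mean using the input-dependent noise estimate.
    m, latent = gp_mean_var(x, y - y0, noise_at(x), xs)
    return m + y0, np.sqrt(latent + noise_at(xs))

def normal_cdf(t):
    """Standard normal CDF evaluated elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(t / math.sqrt(2.0)))

def prob_in_band(mean, std, target, tol):
    """P(target - tol <= mass <= target + tol) under a Gaussian prediction."""
    return (normal_cdf((target + tol - mean) / std)
            - normal_cdf((target - tol - mean) / std))

# Synthetic data (an assumption): mass grows with u, and so does its spread.
rng = np.random.default_rng(0)
u_train = rng.uniform(0.0, 1.0, 200)
mass_train = (2.0 + 3.0 * u_train
              + (0.05 + 0.5 * u_train) * rng.standard_normal(200))

candidates = np.linspace(0.05, 0.95, 91)
mean, std = fit_predict(u_train, mass_train, candidates)
p = prob_in_band(mean, std, target=3.0, tol=0.3)
best = int(np.argmax(p))
print(f"u = {candidates[best]:.2f}, mean = {mean[best]:.2f} kg, "
      f"std = {std[best]:.2f} kg, P(in band) = {p[best]:.2f}")
```

Because the predicted spread differs across inputs, the selected input need not be the one whose predicted mean is closest to the target: a slightly off-target input with a tighter distribution can yield a higher probability of landing inside the band. A full treatment would also condition on the soil surface shape and learn the kernel hyperparameters, both of which this sketch omits.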