This study has developed a rigorous and efficient maximum likelihood method for estimating the parameters in stochastic energy balance models (with any k > 0 number of boxes) given time series of surface temperature and top-of-the-atmosphere net downward radiative flux. The method works by finding a state-space representation of the linear dynamic system and evaluating the likelihood recursively via the Kalman filter. Confidence intervals for estimated parameters are straightforward to construct in the maximum likelihood framework, and information criteria may be used to choose an optimal number of boxes for parsimonious k-box emulation of atmosphere-ocean general circulation models (AOGCMs). In addition to estimating model parameters the method enables hidden state estimation for the unobservable boxes corresponding to the deep ocean, and also enables noise filtering for observations of surface temperature. Feasibility, reliability and performance of the proposed method are demonstrated in a simulation study. In order to obtain a set of optimal k-box emulators, models are fitted to the 4×CO2 step responses of 16 AOGCMs in CMIP5. It is found that for all 16 AOGCMs three boxes are required for optimal k-box emulation. The number of boxes, k, is found to influence, sometimes strongly, the impulse responses of the fitted models.