Several pool-based active learning algorithms have been employed to model potential energy surfaces (PESs) with a minimum number of electronic structure calculations. Among these, uncertainty-based algorithms are a popular class. Their key principle is to query the molecular structures for which the model's predictions are most uncertain. We empirically show that this strategy is not optimal for nonuniform data distributions because it collects many structures from sparsely sampled regions, which are less important to applications of the PES. We use a simple stochastic algorithm to correct for this behavior and implement it using regression trees, which have relatively small computational costs. We show that this algorithm requires around half the data to converge to the same accuracy as the uncertainty-based algorithm query-by-committee. Simulations on a 6D PES of pyrrole(H₂O) show that fewer than 15 000 configurations are enough to build a PES with a generalization error of 16 cm⁻¹, whereas the final model with around 50 000 configurations has a generalization error of 11 cm⁻¹.
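
The difference between the two query strategies can be illustrated with a minimal sketch. It assumes that a committee of models (for example, regression trees) has already predicted an energy for every candidate in the pool. The function names are hypothetical, and the weighted sampling rule is only an illustrative stand-in for a stochastic query, since the abstract does not spell out the exact rule. Because candidates are drawn from the pool itself, densely populated regions contribute more candidates and are therefore queried more often than under a pure maximum-uncertainty rule.

```python
import random
from statistics import pvariance


def committee_disagreement(predictions):
    """Variance of the committee predictions for each pool candidate.

    predictions[m][i] is the energy predicted by committee member m
    for candidate i (hypothetical layout, chosen for this sketch).
    """
    n_candidates = len(predictions[0])
    return [
        pvariance([member[i] for member in predictions])
        for i in range(n_candidates)
    ]


def query_max_uncertainty(disagreement, k):
    """Deterministic query-by-committee: the k most uncertain candidates."""
    order = sorted(
        range(len(disagreement)),
        key=lambda i: disagreement[i],
        reverse=True,
    )
    return order[:k]


def query_stochastic(disagreement, k, rng):
    """Draw k distinct candidates with probability proportional to disagreement.

    Assumption: this weighting is illustrative, not the published rule.
    """
    remaining = list(range(len(disagreement)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        weights = [disagreement[i] for i in remaining]
        if sum(weights) == 0:
            # No disagreement left: fall back to a uniform draw.
            pick = rng.choice(remaining)
        else:
            pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


# Toy pool: three committee members, four candidate structures.
toy_predictions = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.5, 3.0, 8.0],
    [1.0, 1.5, 3.0, 0.0],
]
toy_disagreement = committee_disagreement(toy_predictions)
top_two = query_max_uncertainty(toy_disagreement, 2)
sampled_two = query_stochastic(toy_disagreement, 2, random.Random(0))
```

In an active learning loop, the selected indices would be sent to the electronic structure code, the new energies added to the training set, and the committee retrained before the next query.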