Accurate capacity estimation is important for the safe operation of batteries. Existing state-of-the-art approaches rely heavily on feature engineering to model capacity degradation, and the required features are difficult to design and extract. In this paper, a novel, purely data-driven capacity estimation method is proposed that applies ensemble learning to extremely sparse significant points on the voltage and/or temperature curve. The significant points are simply raw points evenly distributed throughout the charging/discharging process, or points corresponding to specific state-of-charge (SoC) values, and are therefore easy to extract without a complicated feature engineering process on the raw data. A novel ensemble learning framework incorporating the light gradient boosting machine (LightGBM) and a neural network is employed to learn the regression relationship between the significant points and battery capacity. A public battery dataset collected by Oxford University is used to verify the effectiveness of the proposed method. Results show that, on the Oxford dataset, the maximum and mean capacity estimation errors can be kept within 1.25% and 0.5%, respectively, which is superior to most existing capacity estimation methods. Moreover, the robustness, generalization ability, and practical application of the model are discussed.

INDEX TERMS Capacity estimation; ensemble learning; data-driven; significant points.
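
To make the notion of significant points concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes a charging curve is given as a plain list of raw samples. The function names `evenly_spaced_points`, `points_at_soc`, and `average_ensemble` are hypothetical, and simple averaging is only an assumed stand-in for the ensemble framework, whose exact combination rule is not specified here. In practice the base predictors would be a trained LightGBM regressor and a trained neural network.

```python
from bisect import bisect_left


def evenly_spaced_points(curve, n_points):
    """Pick n_points raw samples at evenly spaced indices, including both ends."""
    if n_points < 2 or n_points > len(curve):
        raise ValueError("n_points must be between 2 and len(curve)")
    last = len(curve) - 1
    indices = [round(i * last / (n_points - 1)) for i in range(n_points)]
    return [curve[i] for i in indices]


def points_at_soc(soc, voltage, targets):
    """Read the voltage at given SoC values by linear interpolation.

    Assumes soc is strictly increasing and aligned with voltage.
    """
    result = []
    for t in targets:
        if not soc[0] <= t <= soc[-1]:
            raise ValueError("target SoC outside the measured range")
        j = bisect_left(soc, t)
        if soc[j] == t:
            # Target coincides with a measured sample.
            result.append(voltage[j])
        else:
            # Target lies strictly between soc[j - 1] and soc[j].
            w = (t - soc[j - 1]) / (soc[j] - soc[j - 1])
            result.append(voltage[j - 1] + w * (voltage[j] - voltage[j - 1]))
    return result


def average_ensemble(predictors, features):
    """Assumed combination rule: unweighted mean of the base predictions."""
    return sum(p(features) for p in predictors) / len(predictors)


# Illustrative, made-up samples of a discharge voltage curve (volts).
curve = [4.2, 4.1, 4.0, 3.9, 3.8, 3.7, 3.6, 3.5, 3.4]
features = evenly_spaced_points(curve, 3)

# Hypothetical stand-ins for a trained LightGBM model and a neural network.
tree_model = lambda x: 0.70
neural_model = lambda x: 0.74
capacity_estimate = average_ensemble([tree_model, neural_model], features)
```

Either extractor returns a short, fixed-length vector that can be fed to the regressors directly, which is the property that removes the need for hand-crafted features.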