Spatiotemporal data fusion is a key technique for generating unified time-series images from multiple satellite platforms to support the mapping and monitoring of vegetation. However, the reflectance spectra of different vegetation types are highly similar, which poses a major challenge for the similar-pixel selection procedure of spatiotemporal data fusion and may introduce considerable uncertainty into the fused images. To address this issue, we propose an object-based spatiotemporal data-fusion framework that replaces the original similar-pixel selection procedure with an object-restricted method. The proposed framework can be applied to any spatiotemporal data-fusion algorithm based on similar pixels. In this study, we modified the spatial and temporal adaptive reflectance fusion model (STARFM), the enhanced spatial and temporal adaptive reflectance fusion model (ESTARFM), and the flexible spatiotemporal data-fusion model (FSDAF) using the proposed framework, and evaluated their performance in fusing Sentinel-2 and Landsat 8 images, Landsat 8 and Moderate-resolution Imaging Spectroradiometer (MODIS) images, and Sentinel-2 and MODIS images at a study site covered by grasslands, croplands, coniferous forests, and broadleaf forests. The results show that the proposed object-based framework significantly improves all three data-fusion algorithms by delineating vegetation boundaries more clearly. The improvement is greatest for FSDAF, with an average decrease of 2.8% in relative root-mean-square error (rRMSE) across all sensor combinations. Moreover, the improvement in fusing Sentinel-2 and Landsat 8 images is more pronounced (an average decrease of 2.5% in rRMSE). Using the fused images generated by the proposed object-based framework significantly reduces the "salt-and-pepper" effect in the vegetation mapping result.
We believe that the proposed object-based framework has great potential for generating high-resolution time-series remote-sensing data for vegetation mapping applications.
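To make the idea of object-restricted similar-pixel selection concrete, the following is a minimal sketch rather than the framework's actual implementation. It assumes a STARFM-style spectral test (a threshold derived from the band's standard deviation and an assumed number of classes) and assumes that the object restriction simply requires a candidate pixel to share the segment label of the central pixel. The function name `similar_pixels`, its parameters, and the toy arrays are all hypothetical.

```python
import numpy as np

def similar_pixels(fine, labels, row, col, half_window=1, n_classes=4):
    """Return two boolean masks over the moving window centred on (row, col).

    The first mask is a purely spectral selection; the second additionally
    restricts candidates to the central pixel's object (assumed rule).
    """
    # Clip the moving window to the image bounds.
    r0, r1 = max(row - half_window, 0), min(row + half_window + 1, fine.shape[0])
    c0, c1 = max(col - half_window, 0), min(col + half_window + 1, fine.shape[1])
    window = fine[r0:r1, c0:c1]

    # STARFM-style spectral test (assumed form): a pixel is similar when its
    # reflectance differs from the centre by at most 2 * std / n_classes.
    threshold = 2.0 * fine.std() / n_classes
    spectral = np.abs(window - fine[row, col]) <= threshold

    # Object restriction (assumed form): keep only pixels in the same segment.
    same_object = labels[r0:r1, c0:c1] == labels[row, col]
    return spectral, spectral & same_object

# Toy fine-resolution band and a hypothetical segmentation with two objects.
fine = np.array([[0.10, 0.11, 0.30],
                 [0.10, 0.12, 0.31],
                 [0.11, 0.12, 0.12]])
labels = np.array([[1, 1, 2],
                   [1, 1, 2],
                   [1, 1, 2]])

spectral, restricted = similar_pixels(fine, labels, row=1, col=1)
print(spectral.sum(), restricted.sum())
```

In this toy case the bottom-right pixel has the same reflectance as the central pixel but belongs to a different object, so the spectral test alone selects it while the object-restricted selection drops it. This is the kind of confusion between spectrally similar vegetation types that the object restriction is meant to reduce.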