Rice is one of the world’s major staple foods, especially in China. Highly accurate monitoring on rice-producing land is, therefore, crucial for assessing food supplies and productivity. Recently, the deep-learning convolutional neural network (CNN) has achieved considerable success in remote-sensing data analysis. A CNN-based paddy-rice mapping method using the multitemporal Landsat 8, phenology data, and land-surface temperature (LST) was developed during this study. First, the spatial–temporal adaptive reflectance fusion model (STARFM) was used to blend the moderate-resolution imaging spectroradiometer (MODIS) and Landsat data for obtaining multitemporal Landsat-like data. Subsequently, the threshold method is applied to derive the phenological variables from the Landsat-like (Normalized difference vegetation index) NDVI time series. Then, a generalized single-channel algorithm was employed to derive LST from the Landsat 8. Finally, multitemporal Landsat 8 spectral images, combined with phenology and LST data, were employed to extract paddy-rice information using a patch-based deep-learning CNN algorithm. The results show that the proposed method achieved an overall accuracy of 97.06% and a Kappa coefficient of 0.91, which are 6.43% and 0.07 higher than that of the support vector machine method, and 7.68% and 0.09 higher than that of the random forest method, respectively. Moreover, the Landsat-derived rice area is strongly correlated (R2 = 0.9945) with government statistical data, demonstrating that the proposed method has potential in large-scale paddy-rice mapping using moderate spatial resolution images.