Photosynthesis is the key physiological activity in crop growth and plays an irreplaceable role in carbon assimilation and yield formation. This study extracted rice (Oryza sativa L.) canopy reflectance from UAV multispectral images and analyzed the correlations of 25 vegetation indices (VIs) and three textural indices (TIs) with net photosynthetic rate (Pn) at different growth stages. Linear regression (LR), support vector regression (SVR), gradient boosting decision tree (GBDT), random forest (RF), and multilayer perceptron neural network (MLP) models were employed for Pn estimation, and modeling accuracy was compared under three input conditions: VIs alone, VIs combined with TIs, and the fusion of VIs and TIs with plant height (PH) and SPAD. The results showed that VIs and TIs generally had the strongest correlations with Pn at the jointing–booting stage, at which the number of VIs with significant correlation (p < 0.05) was also the largest; accordingly, the models achieved their highest overall accuracy at this stage [coefficient of determination (R2) of 0.383–0.938]. As the growth stages progressed, the correlations gradually weakened and accuracy decreased (R2 of 0.258–0.928 and 0.125–0.863 at the heading–flowering and ripening stages, respectively). Among the tested models, GBDT and RF attained the best performance with VIs input alone (R2 of 0.863–0.938 and 0.815–0.872, respectively). Furthermore, the fused input of VIs, TIs, PH, and SPAD improved model accuracy more effectively (R2 increased by 0.049–0.249, 0.063–0.470, and 0.113–0.471 for the three growth stages, respectively) than the combination of VIs and TIs alone (R2 increased by 0.015–0.090, 0.001–0.139, and 0.023–0.114). Therefore, the GBDT and RF models with fused input are highly recommended for rice Pn estimation, and the methods may also provide a reference for Pn monitoring and further yield prediction at the field scale.
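To make the model-comparison workflow concrete, the sketch below shows one way the five regression models and three input conditions described above could be compared by R2 using scikit-learn. This is an illustrative outline, not the authors' code: the CSV file name, the "VI_"/"TI_" column prefixes, the train/test split, and all hyperparameters are assumptions introduced here for demonstration.

```python
# Minimal sketch (assumed data layout): compare LR, SVR, GBDT, RF, and MLP
# for Pn estimation under three feature-set conditions, scored by R2.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

# Hypothetical feature table: one row per plot, with vegetation indices (VI_*),
# textural indices (TI_*), plant height (PH), SPAD, and measured Pn.
df = pd.read_csv("rice_canopy_features.csv")  # assumed file name

feature_sets = {
    "VIs": [c for c in df.columns if c.startswith("VI_")],
    "VIs+TIs": [c for c in df.columns if c.startswith(("VI_", "TI_"))],
    "VIs+TIs+PH+SPAD": [c for c in df.columns if c.startswith(("VI_", "TI_"))] + ["PH", "SPAD"],
}
models = {
    "LR": LinearRegression(),
    "SVR": SVR(kernel="rbf"),
    "GBDT": GradientBoostingRegressor(random_state=0),
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    "MLP": MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0),
}

for set_name, cols in feature_sets.items():
    X_train, X_test, y_train, y_test = train_test_split(
        df[cols], df["Pn"], test_size=0.3, random_state=0
    )
    for model_name, model in models.items():
        model.fit(X_train, y_train)
        r2 = r2_score(y_test, model.predict(X_test))
        print(f"{set_name:>16} | {model_name:<4} R2 = {r2:.3f}")
```

In practice, such a comparison would be run separately for each growth stage (jointing–booting, heading–flowering, and ripening) so that the stage-dependent decline in accuracy reported above can be observed.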