Computer-aided automatic segmentation of retinal blood vessels plays an important role in the diagnosis of diseases such as diabetes, glaucoma, and macular degeneration. In this paper, we propose a multi-scale feature fusion retinal vessel segmentation model based on U-Net, named MSFFU-Net. The model introduces the inception structure into the multi-scale feature extraction encoder part, and the max-pooling index is applied during the upsampling process in the feature fusion decoder of an improved network. The skip layer connection is used to transfer each set of feature maps generated on the encoder path to the corresponding feature maps on the decoder path. Moreover, a cost-sensitive loss function based on the Dice coefficient and cross-entropy is designed. Four transformations—rotating, mirroring, shifting and cropping—are used as data augmentation strategies, and the CLAHE algorithm is applied to image preprocessing. The proposed framework is tested and trained on DRIVE and STARE, and sensitivity (Sen), specificity (Spe), accuracy (Acc), and area under curve (AUC) are adopted as the evaluation metrics. Detailed comparisons with U-Net model, at last, it verifies the effectiveness and robustness of the proposed model. The Sen of 0.7762 and 0.7721, Spe of 0.9835 and 0.9885, Acc of 0.9694 and 0.9537 and AUC value of 0.9790 and 0.9680 were achieved on DRIVE and STARE databases, respectively. Results are also compared to other state-of-the-art methods, demonstrating that the performance of the proposed method is superior to that of other methods and showing its competitive results.