U-Net-based networks have achieved competitive performance in retinal vessel segmentation. Previous work has focused on exploiting multiscale high-level features to improve segmentation accuracy but has overlooked the importance of shallow features. In addition, repeated upsampling and convolution operations may degrade the semantic information carried by the decoder layers. To address these problems, we propose a scale and feature aggregate network (SFA-Net) that makes full use of both multiscale high-level features and shallow features. A residual atrous spatial feature aggregate block (RASF) is embedded at the end of the encoder to learn multiscale information. Furthermore, an attentional feature module (AFF) is proposed to fuse shallow and high-level features more effectively. We also design a multi-path feature fusion (MPF) block that fuses the high-level features of different decoder layers, learning the relationships between the features of different paths and alleviating information loss. We evaluate the network on three benchmark datasets (DRIVE, STARE, and CHASE_DB1) and compare it with current state-of-the-art methods. The experimental results demonstrate that the proposed SFA-Net performs effectively, suggesting that the network is suitable for processing complex medical images.
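To make the multiscale idea behind a residual atrous block concrete, the following is a minimal, illustrative sketch and not the RASF block itself. It assumes a 1D signal in place of a 2D feature map, a fixed averaging of branches in place of learned aggregation, and hypothetical dilation rates (1, 2, 4). The function names `atrous_conv1d` and `residual_multiscale_block` are introduced here for illustration only.

```python
def atrous_conv1d(signal, kernel, rate):
    """1D atrous (dilated) convolution with zero padding; output length equals input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def residual_multiscale_block(signal, kernel, rates=(1, 2, 4)):
    """Average parallel atrous branches (assumed rates), then add the input back (residual)."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    aggregated = [sum(vals) / len(rates) for vals in zip(*branches)]
    return [x + a for x, a in zip(signal, aggregated)]


# A unit impulse shows how each dilation rate widens the receptive field.
impulse = [0.0] * 4 + [1.0] + [0.0] * 4
response = residual_multiscale_block(impulse, kernel=[1.0, 1.0, 1.0])
```

With the impulse input, the response is nonzero at offsets of 1, 2, and 4 samples from the centre, one pair per branch. This illustrates how parallel dilation rates gather context at several scales without downsampling, while the residual connection preserves the original signal.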