Semantic segmentation of high-resolution aerial images is an important problem in remote sensing applications. To address intra-class heterogeneity and inter-class homogeneity, this paper proposes a novel end-to-end semantic segmentation network, the Context and Semantic Enhanced High-Resolution Network (CSE-HRNet). The network combines two components: a multi-scale contextual feature extractor and a multi-level semantic feature producer. First, a Nested Dilated Residual Block (NDRB) is designed to enhance the representational power of multi-scale contexts and thereby tackle intra-class heterogeneity. Second, a pyramidal feature hierarchy is introduced, through which multi-level feature fusion is used to enlarge inter-class semantic differences. Experimental results on the Potsdam and Vaihingen benchmarks show that the proposed CSE-HRNet achieves competitive performance compared with other state-of-the-art methods.

INDEX TERMS Semantic segmentation, image analysis, machine learning, remote sensing image.