This paper focuses on unsupervised image segmentation, an essential topic in computer vision. Without prior knowledge, it is challenging to automatically generate semantically meaningful segmentation regions from image content alone. In this paper, we approach unsupervised image segmentation from the perspective of sub-region clustering and graph convolution. We over-segment the source image into disjoint sub-regions and generate multiscale representative maps for each sub-region. To exploit the potential contextual correlations between different sub-regions, we build a graph model that establishes their dependency relationships and use a graph convolutional network to propagate long-range contextual information. The final segmentation is obtained by combining sub-region clustering with graph convolution training. We conduct extensive experiments on three image datasets, and the results show that the proposed method produces consistent and meaningful segmentation results.
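To make the pipeline outlined above concrete, the following is a minimal, illustrative sketch, not the authors' implementation: SLIC superpixels stand in for the over-segmentation step, per-region mean colour stands in for the multiscale representative maps, a single untrained symmetric-normalised graph-convolution step stands in for the trained graph convolutional network, and k-means stands in for the sub-region clustering. All names and parameter choices here are assumptions for illustration only.

```python
# Minimal sketch of the abstract's pipeline (assumptions noted in the lead-in).
import numpy as np
import torch
from skimage import data
from skimage.segmentation import slic
from sklearn.cluster import KMeans

image = data.astronaut()  # stand-in source image
labels = slic(image, n_segments=200, compactness=10, start_label=0)  # over-segmentation
n = labels.max() + 1

# Per-sub-region feature: mean colour (a crude stand-in for multiscale representative maps).
feats = np.zeros((n, 3))
for i in range(n):
    feats[i] = image[labels == i].mean(axis=0)

# Build the graph: two sub-regions are connected if their pixels touch.
adj = np.zeros((n, n))
h_pairs = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], axis=1)
v_pairs = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], axis=1)
for a, b in np.unique(np.vstack([h_pairs, v_pairs]), axis=0):
    if a != b:
        adj[a, b] = adj[b, a] = 1.0

# One graph-convolution step: H' = relu(D^-1/2 (A + I) D^-1/2 H W).
A = torch.tensor(adj + np.eye(n), dtype=torch.float32)
d_inv_sqrt = torch.diag(A.sum(1).pow(-0.5))
A_hat = d_inv_sqrt @ A @ d_inv_sqrt
H = torch.tensor(feats / 255.0, dtype=torch.float32)
W = torch.nn.Linear(3, 16, bias=False)  # untrained weights, for illustration only
H_prop = torch.relu(A_hat @ W(H))  # context-aware sub-region embeddings

# Cluster the propagated embeddings and map cluster labels back to pixels.
region_labels = KMeans(n_clusters=8, n_init=10).fit_predict(H_prop.detach().numpy())
segmentation = region_labels[labels]
print(segmentation.shape)
```

In the paper itself the graph convolution weights are trained jointly with the clustering objective; this sketch only shows how sub-region features, the region graph, and a single propagation step fit together.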