Large-scale hyperspectral image (HSI) clustering remains a fundamental and challenging task because of the large spatial scale of the scenes, the abundant spectral band information, and the lack of prior information. Most existing clustering methods either ignore the correlation among spectral bands, which degrades clustering performance, or cannot process such images at all because of their large spatial scale. To address these difficulties, this paper presents a dual smooth graph convolutional clustering (DSGCC) framework for large-scale HSI clustering. Specifically, superpixels are introduced to decrease the spatial scale of the HSI and to reduce the number of graph nodes for subsequent network training. Furthermore, a smooth graph filter is presented, which extracts smooth features and filters out high-frequency interference in graph learning. In addition, we propose a layer-wise graph reconstruction (LGR) mechanism, which constrains the output of every hidden layer with a graph reconstruction loss. Finally, we introduce a self-training method that uses soft labels to supervise the clustering process and learns a robust embedding for node clustering. DSGCC is an end-to-end network that is optimized with a joint loss and can be easily trained from scratch. We evaluate DSGCC on five commonly used large-scale HSI datasets, and the experiments show that DSGCC achieves the best clustering performance, outperforming existing HSI clustering approaches. On the Salinas, Indian Pines, Pavia University, WHU-Hi-LongKou, and WHU-Hi-HongHu datasets, the clustering overall accuracy (OA) of DSGCC is 85.43%, 65.45%, 70.48%, 87.30%, and 77.34%, respectively.
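To make the idea of filtering high-frequency interference on a graph concrete, the following minimal sketch applies a generic low-pass graph filter to a node signal. The abstract does not give the form of the DSGCC smooth graph filter, so everything here is an assumption chosen for illustration: the filter (I - tL)^k built on the symmetric normalized Laplacian L, the step size `t`, the order `k`, and all function names. It should be read as an example of graph smoothing in general, not as the authors' implementation.

```python
import math


def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a dense, symmetric adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


def matvec(matrix, vec):
    """Dense matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def smooth(adj, x, k=1, t=0.5):
    """Apply the assumed low-pass filter (I - t L)^k to a node signal x.

    With t = 0.5 the frequency response 1 - t * lam lies in [0, 1] for every
    Laplacian eigenvalue lam in [0, 2], so high-frequency components are
    attenuated more strongly than low-frequency ones.
    """
    lap = normalized_laplacian(adj)
    out = list(x)
    for _ in range(k):
        lap_out = matvec(lap, out)
        out = [xi - t * li for xi, li in zip(out, lap_out)]
    return out


def dirichlet_energy(adj, x):
    """Smoothness measure x^T L x; smaller values mean a smoother signal."""
    lap = normalized_laplacian(adj)
    return sum(xi * li for xi, li in zip(x, matvec(lap, x)))


if __name__ == "__main__":
    # Hypothetical 4-node path graph standing in for a superpixel graph.
    adj = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
    noisy = [1.0, -1.0, 1.0, -1.0]  # rapidly oscillating (high-frequency) signal
    filtered = smooth(adj, noisy, k=2)
    print(dirichlet_energy(adj, noisy), dirichlet_energy(adj, filtered))
```

In this sketch the Dirichlet energy of the filtered signal is lower than that of the input, which is the sense in which the filter "smooths" node features. In a superpixel-based pipeline the nodes would correspond to superpixels and `x` to one feature channel, but how DSGCC constructs its graph and filter is not stated in the abstract.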