Because it does not require pixel-level labeled samples, weakly supervised semantic segmentation (WSSS) is making progress in automatically extracting buildings from high-resolution (HR) remote sensing (RS) imagery. For WSSS methods, generating high-quality pseudo-masks is crucial for accurate building extraction. To improve pseudo-mask generation from image-level labels, this paper proposes a weakly supervised building extraction method that combines adversarial climbing and gated convolution (ACGC). The proposed method optimizes class activation maps (CAMs) with an adversarial climbing strategy, generates accurate class boundary maps (CBMs) by introducing a gated convolution module (GCM), and further refines building pseudo-masks by fusing pairwise semantic affinities and CAMs with a random walk strategy. Experimental results on three datasets (two ISPRS datasets and a self-annotated dataset) demonstrate that the proposed approach outperformed state-of-the-art (SOTA) WSSS methods, improving building extraction from HR RS imagery. This study provides a new approach for optimizing pseudo-mask generation and a methodological reference for applying weakly supervised learning to RS images.
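The final refinement step, fusing pairwise semantic affinities with CAMs through a random walk, can be illustrated with a minimal NumPy sketch. This is a generic affinity-propagation scheme of the kind commonly used in WSSS, not the authors' implementation: the function name `random_walk_refine`, the sharpening exponent `beta`, the step count `n_steps`, and the dense affinity matrix are all assumptions made for illustration.

```python
import numpy as np


def random_walk_refine(cam, affinity, beta=2.0, n_steps=8):
    """Propagate CAM scores along pairwise affinities (illustrative sketch).

    cam:      (H, W) array of activation scores for one class.
    affinity: (H*W, H*W) array of non-negative pairwise affinities.
    beta, n_steps: hypothetical hyperparameters, not taken from the paper.
    """
    h, w = cam.shape
    # Element-wise power suppresses weak affinities before normalisation.
    sharpened = np.power(affinity, beta)
    # Row-normalise so each row is a transition distribution.
    row_sums = sharpened.sum(axis=1, keepdims=True)
    transition = sharpened / np.maximum(row_sums, 1e-12)
    # Repeatedly diffuse the flattened CAM through the transition matrix.
    scores = cam.reshape(-1).astype(np.float64)
    for _ in range(n_steps):
        scores = transition @ scores
    return scores.reshape(h, w)


# Toy 1x4 image: pixels 0-1 form one region, pixels 2-3 another.
toy_cam = np.array([[1.0, 0.0, 0.0, 0.0]])
toy_affinity = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])
refined = random_walk_refine(toy_cam, toy_affinity)
```

In the toy example, the activation on pixel 0 spreads to pixel 1, which shares high affinity with it, but does not cross into pixels 2 and 3, where the affinity is zero. In a boundary-aware pipeline, affinities between pixels separated by a predicted class boundary would play the role of these zeros.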