Automatic object removal with completion of obstructed façades in urban environments is essential for many applications, such as scene restoration, environmental impact assessment, and urban mapping. However, previous object-removal methods typically require a user to manually create a mask around unwanted objects and to obtain background façade information in advance, which is labor-intensive in multi-task projects. Moreover, accurately detecting the objects to be removed in a cityscape and inpainting the static obstructed building façades to obtain plausible images are the main challenges of this task. To overcome these difficulties, this study addresses object removal with façade inpainting from the following two aspects. First, we propose an image-based cityscape elimination method for automatic object removal and façade inpainting that applies semantic segmentation to detect several classes, including pedestrians, riders, vegetation, and cars, and uses generative adversarial networks (GANs) to fill the detected regions with background textures and patch information from street-level imagery. Second, we propose a workflow that automatically filters unoccluded building façades from street view images and tailors a dataset of original and mask images for the GAN-based image inpainting model. Furthermore, several full-reference image quality assessment (IQA) metrics are introduced to evaluate the quality of the generated images. Validation results demonstrate the feasibility and effectiveness of the proposed method, and the synthesized images are visually realistic and semantically consistent.

INDEX TERMS Generative adversarial networks, semantic segmentation, automatic object removal, façade inpainting, street view images.
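The pipeline summarized above can be sketched in two small pieces: deriving a binary removal mask from a semantic label map (the mask that would be handed to the GAN inpainting model), and scoring an inpainted result with a full-reference IQA metric such as PSNR. This is a minimal illustration, not the authors' implementation; the class IDs are hypothetical placeholders and the inpainting network itself is not shown.

```python
import numpy as np

# Hypothetical per-pixel class IDs for the removable categories named in the
# abstract (pedestrians, riders, vegetation, cars); actual IDs depend on the
# segmentation model and label scheme used.
REMOVABLE_CLASSES = (24, 25, 21, 26)

def removal_mask(label_map: np.ndarray) -> np.ndarray:
    """Binary mask (1 = pixel to remove and inpaint) from a semantic label map."""
    return np.isin(label_map, REMOVABLE_CLASSES).astype(np.uint8)

def psnr(reference: np.ndarray, generated: np.ndarray, peak: float = 255.0) -> float:
    """Full-reference PSNR in dB between a ground-truth façade and a generated one."""
    mse = np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

In a full system, `removal_mask` would be applied to the segmentation output of a street view image, the masked regions would be filled by the GAN inpainting model, and `psnr` (alongside other full-reference IQA metrics such as SSIM) would compare the synthesized façade against an unoccluded reference where one is available.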