In this paper, we propose a method for extracting the structure of an indoor environment using radar. When radar is used indoors, multipath propagation of the radio waves produces ghost targets. These ghost targets prevent accurate mapping and consequently hinder the extraction of the environment's structure. We therefore propose a deep learning-based method that uses image-to-image translation to extract the structure by removing ghost targets from the indoor environment map. The proposed method employs a conditional generative adversarial network (CGAN) consisting of a U-Net-based generator and a patch generative adversarial network (PatchGAN)-based discriminator. By repeatedly having the discriminator judge whether a generated indoor structure is real or fake, the CGAN learns to return a structure similar to the real environment. First, we generate a radar map of the indoor environment, which includes ghost targets. Next, we extract the structure of the indoor environment from the map using the proposed method. Finally, we compare the proposed method with environment extraction methods based on the k-nearest neighbors algorithm, the Hough transform, and density-based spatial clustering of applications with noise (DBSCAN), using the structural similarity index and structural content as evaluation metrics. In this comparison, the proposed method extracts the environment more accurately and requires no parameter adjustment, even when the environment changes.
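
The two evaluation metrics named above can be illustrated with a minimal sketch. It assumes maps are small 2-D occupancy grids with values in [0, 1]. It uses the commonly cited definition of structural content (ratio of summed squared reference pixels to summed squared estimated pixels) and a single-window (global) form of the structural similarity index, rather than the locally windowed average that is usually reported. The function names `structural_content` and `global_ssim` and the toy grids are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def flatten(grid):
    """Flatten a 2-D grid (list of rows) into a list of floats."""
    return [float(v) for row in grid for v in row]


def structural_content(reference, estimate):
    """Structural content: sum(reference^2) / sum(estimate^2).

    A value of 1.0 means both maps carry the same total energy;
    extra ghost targets in the estimate push the value below 1.0.
    """
    ref, est = flatten(reference), flatten(estimate)
    denom = sum(v * v for v in est)
    if denom == 0:
        raise ValueError("estimate map is all zeros")
    return sum(v * v for v in ref) / denom


def global_ssim(reference, estimate, data_range=1.0):
    """SSIM computed over the whole image as a single window.

    Assumption: this is a simplification of the usual SSIM, which
    averages the same expression over local sliding windows.
    """
    x, y = flatten(reference), flatten(estimate)
    if len(x) != len(y):
        raise ValueError("maps must have the same size")
    c1 = (0.01 * data_range) ** 2  # conventional stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = fmean(x), fmean(y)
    var_x = fmean([(a - mx) * (a - mx) for a in x])
    var_y = fmean([(b - my) * (b - my) for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (var_x + var_y + c2)
    return num / den


# Toy example: one wall along the top row, plus a hypothetical ghost
# segment in the bottom row of the radar map.
true_map = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]
ghost_map = [[1, 1, 1], [0, 0, 0], [1, 1, 0]]

print("SC  :", structural_content(true_map, ghost_map))
print("SSIM:", global_ssim(true_map, ghost_map))
```

Under these assumptions, both metrics equal 1 when the extracted structure matches the reference exactly and move away from 1 as ghost targets remain in the map.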