To address low outdoor thermal comfort and poor ventilation in Beijing's Hutongs, this paper proposes a rapid intelligent optimization method that combines Pix2Pix (image-to-image translation with conditional adversarial networks) with a genetic algorithm. First, the buildings in the study area are abstracted into four traditional building types. These types are then placed on the site together with open spaces in specified proportions, and a genetic algorithm is used to construct a multi-objective optimization model for the UTCI (Universal Thermal Climate Index) and building area, generating and iteratively optimizing the spatial layout of the building cluster. Finally, Pix2Pix is trained on a large number of Hutong layout samples so that it can rapidly predict UTCI and ventilation results, which serve as the optimization objectives for obtaining the optimal solution set of Hutong spatial forms. Compared with traditional experience-based design, the method traverses a vast solution space quickly and efficiently and generates Hutong renovation schemes that balance cultural heritage with health and comfort. The results show that the method runs 26.4 times faster than traditional performance simulation and indicate that the proportions of Siheyuan, Sanheyuan, Erheyuan, new buildings, and empty space in Da Yuan Hutong in Beijing should be controlled at 11.8%, 16.9%, 23.8%, 33.8%, and 13.7%, respectively. The building density should be kept between 0.5 and 0.58, and the floor area ratio between 0.96 and 1.14. Layouts within these ranges significantly improve outdoor comfort, enhance the living environment of the Hutong, and promote sustainable urban development.
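
The "optimal solution set" of a two-objective problem such as the one described above is usually understood as a Pareto front. The following is a minimal sketch of that idea, not the paper's implementation: it assumes each candidate layout has already been scored with a UTCI-based discomfort value (lower is better) and a building area (higher is better). The function names and the sample scores are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Each candidate layout is scored as (utci_discomfort, building_area).
# Assumption: lower discomfort is better, larger building area is better.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of layouts that no other layout dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Made-up scores for four hypothetical layouts (not data from the paper).
candidates = [(3.0, 900.0), (2.5, 850.0), (3.5, 880.0), (2.5, 800.0)]
front = pareto_front(candidates)
print(front)  # [0, 1]
```

In this toy example the third layout is dominated by the first (more discomfort, less area) and the fourth by the second (same discomfort, less area), so only the first two remain as trade-off solutions. A genetic algorithm would apply a filter of this kind to each generation, with the per-layout scores supplied by a fast surrogate such as the trained Pix2Pix model.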