Although creating high-quality urban green space (UGS) is of considerable importance to public health, few studies have used individuals’ emotions to evaluate UGS quality. This study aims to develop a multidimensional emotional assessment method for UGS from the perspective of spatial quality. Panoramic videos of 15 scenes in the West Lake Scenic Area were shown to 34 participants. For each scene, 12 attributes of spatial quality were quantified, covering perceived plant attributes, spatial structure attributes, and experiences of UGS. The Self-Assessment Manikin (SAM) scale and a face recognition model were then used to measure participants’ valence and arousal. Among all the predictors, the percentages of water and plants were the most predictive indicators of emotional responses measured by the SAM scale, whereas the explanatory power of the model for emotional responses measured by face recognition was insufficient. Concerning gender differences, women experienced significantly higher valence than men. Higher percentages of water and plants, larger sizes, approximate shape index, and lower canopy densities were often related to positive emotions. Hence, when creating a UGS, designers must consider all structural attributes of green spaces, enrich visual perception, and provide various activities. In addition, we suggest that future studies combine physiological and psychological methods to assess emotional responses, because the face recognition model can provide an objective measurement of emotional responses, while the self-report questionnaire is much easier to administer and can be used as a supplement.