Urban forests are increasingly recognized as vital components of urban ecosystems, offering residents a wide range of physiological and psychological benefits. However, existing research has often focused on a single sensory dimension, either visual or auditory, overlooking the combined impact of audio–visual environments on public health and well-being. This study addresses this gap by examining the effects of composite audio–visual settings within three distinct types of urban forests in Fuzhou, China: mountain, mountain–water, and waterfront forests. Through field surveys and quantitative analysis at 24 sample sites, we assessed visual landscape elements, soundscapes, physiological indicators (e.g., heart rate, skin conductance), and psychological responses (e.g., spiritual vitality, stress relief, emotional arousal, attention recovery) among 77 participants. Our findings reveal that the forest types differ in their influence on visitors’ physiology and psychology, with waterfront forests generally promoting relaxation and mountain–water forests inducing a higher degree of tension. Specific audio–visual elements, such as plants, water scenes, and natural sounds, positively affect psychological restoration, whereas urban noise is associated with increased physiological stress indicators. In conclusion, the integrated effects of audio–visual landscapes significantly shape the public's multisensory experience of urban forests, underscoring the importance of design that incorporates natural elements to create restorative environments that benefit the health and well-being of urban residents. These insights contribute to the scientific understanding of urban forest impacts and inform the design and management of urban green spaces for improved public health outcomes.