Traditionally, the study of sound has focused mainly on properties of the sound source, such as level and quality. More recent work considers the sound, the environment, and the listener together, and examines the structure, composition, and characteristics of the acoustic environment as a soundscape. The purpose of this paper is to study the simulation design of virtual reality soundscape quantification based on cloud computing. First, the definition and characteristics of cloud computing are described, and its key technologies are analyzed. Following the basic principles of technology selection, virtualization technology is analyzed in particular detail. Acoustic elements that may occur in a given park environment, including traffic sound, water flow sound, fountain sound, birdsong, wind sound, rippling sound, beach sound, and seabird sound, are then selected and simulated for subjective evaluation. The experimental results show that when the traffic sound is 60 dB, the evaluation results for the superimposed sound type are the same as when the traffic sound is 50 dB. For the superimposed sound level, 30 dB and 40 dB each differ significantly from both 60 dB and 70 dB; 50 dB differs significantly only from 70 dB; 60 dB differs significantly from every level except 50 dB; and 70 dB differs significantly from every other level. Overall, 60 dB can be regarded as the turning point of the evaluation: when the level of the added sound exceeds 60 dB, the evaluation result becomes markedly worse.