In building-scale VR, where the entire interior of a large building is a virtual space that users can walk around in, it is important to handle movable objects that exist in the real world but not in the virtual space. We propose a mechanism that detects such objects (which are not embedded in the virtual space) in advance and then generates a sound when one of them is struck with a virtual stick. Moreover, in a large indoor virtual environment there may be multiple users at the same time, and their presence can be perceived not only visually but also aurally, e.g., through sounds such as footsteps. We therefore use a GAN-based deep learning system to generate the impact sound of an arbitrary object. First, to display a real-world object visually in the virtual space, its 3D data is captured with an RGB-D camera and saved together with its position information. At the same time, we take an image of the object, break it down into parts, estimate the material of each part, generate a corresponding sound, and associate that sound with the part. When a VR user strikes the object virtually (e.g., with a virtual stick), the associated sound is played. We demonstrate that users can identify the material from the generated sound, confirming the effectiveness of the proposed method.
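To make the final step concrete, the sketch below illustrates one way the per-part association could be organized: each segmented part of a scanned object carries an estimated material label and a pre-generated impact sound, which is looked up when a collision event names the struck part. This is a minimal illustration only; all names (`Part`, `ScannedObject`, `on_hit`) are hypothetical and not taken from the paper, and the GAN-generated waveforms are stubbed out as placeholder bytes.

```python
# Minimal sketch (hypothetical names throughout): associate each segmented
# part of a scanned object with a material label and a GAN-generated impact
# sound, then return that sound when the part is struck in VR.
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    material: str        # material estimated from the part's image (assumed done offline)
    impact_sound: bytes  # waveform pre-generated by the GAN for this material

class ScannedObject:
    """A real-world object captured with an RGB-D camera and split into parts."""
    def __init__(self, position, parts):
        self.position = position                 # saved world-space position
        self.parts = {p.name: p for p in parts}  # part name -> Part

    def on_hit(self, part_name: str) -> bytes:
        # Return the pre-generated sound for the struck part; the VR runtime
        # would hand this buffer to its audio engine for playback.
        return self.parts[part_name].impact_sound

# Usage: a chair whose seat was classified as wood and whose legs as metal.
chair = ScannedObject(
    position=(1.2, 0.0, -3.4),
    parts=[Part("seat", "wood", b"...wav..."), Part("leg", "metal", b"...wav...")],
)
sound = chair.on_hit("seat")  # triggered by a virtual-stick collision event
```

Keeping sounds pre-generated and keyed by part (rather than running the GAN at collision time) matches the pipeline described above, where sound generation happens during capture and only a lookup is needed at interaction time.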