Echolocation plays a crucial role in obstacle avoidance for visually impaired people, as long as the obstacle, either an object or a person, makes a sufficiently audible sound. Otherwise, the obstacle remains invisible to the visually impaired person unless it can be detected by other means, e.g., a white cane. Artificial vision systems, combined with a sound-based device, have proven effective in enhancing the independence and mobility of visually impaired users in daily tasks. In this work, we propose, build, and test an interface that converts depth information acquired by a 3D vision system into 3D sound using head-related transfer functions (HRTFs). Our system registers the environment and renders, in real time, the nearest objects as binaural sound at the user's head, with a distinctive tone and volume according to each object's distance, and its position and orientation from the vision system's point of view. In addition, our system can benefit from the integration of previously developed approaches, such as object, color, and face recognition, to further improve the quality of life of visually impaired people. We test our system on a group of seven volunteers, who show an encouraging exponential learning behaviour when facing two main tasks: crossing doors and navigating in crowded environments.
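To make the depth-to-sound mapping concrete, the sketch below (Python/NumPy) illustrates one plausible encoding of the kind described above; it is not the authors' implementation. The camera frame, the distance-to-pitch/volume mapping, and the function names (spherical_from_camera, distance_to_cue, binaural_tone) are all assumptions introduced for illustration, and a simple interaural level difference stands in for the full HRTF convolution.

```python
import numpy as np

FS = 44100  # audio sample rate (Hz); assumed value, not stated in the paper

def spherical_from_camera(p):
    """Convert an object position p = (x, y, z) in the camera/head frame
    (metres, x = right, y = up, z = forward; assumed convention) into
    distance, azimuth and elevation."""
    x, y, z = p
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(x, z)    # left/right angle
    elevation = np.arcsin(y / r)  # up/down angle
    return r, azimuth, elevation

def distance_to_cue(r, r_max=3.0):
    """Map distance to a tone frequency and amplitude: nearer objects sound
    louder and higher-pitched. The exact mapping is an assumption; the paper
    only states that tone and volume vary with distance."""
    r = np.clip(r, 0.2, r_max)
    amplitude = 1.0 - r / r_max                    # closer -> louder
    frequency = 200.0 + 800.0 * (1.0 - r / r_max)  # closer -> higher pitch
    return frequency, amplitude

def binaural_tone(frequency, amplitude, azimuth, duration=0.15):
    """Generate a short stereo tone for one object. A simple level-based pan
    stands in here for convolution with the measured HRTF impulse responses
    that a real system would use for the computed azimuth and elevation."""
    t = np.arange(int(FS * duration)) / FS
    mono = amplitude * np.sin(2 * np.pi * frequency * t)
    pan = (np.sin(azimuth) + 1.0) / 2.0  # 0 = far left, 1 = far right
    left = mono * np.sqrt(1.0 - pan)
    right = mono * np.sqrt(pan)
    return np.stack([left, right], axis=1)

# Example: an object 1.2 m ahead and slightly to the right of the camera.
r, az, el = spherical_from_camera((0.4, 0.0, 1.2))
freq, amp = distance_to_cue(r)
stereo = binaural_tone(freq, amp, az)  # (samples, 2) array ready for playback
```

In the system described here, the panning step would instead convolve the tone with the HRTF pair corresponding to the computed azimuth and elevation, so that the object is perceived at its true direction rather than merely to the left or right.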