Sound perception is a fundamental skill for many people with severe sight impairments. The research presented in this paper is part of an ongoing project with the aim to create a mobile guidance aid to help people with vision impairments find objects within an unknown indoor environment. This system requires an effective non-visual interface and uses bone-conduction headphones to transmit audio instructions to the user. It has been implemented and tested with spatialised audio cues, which convey the direction of a predefined target in 3D space. We present an in-depth evaluation of the audio interface with several experiments that involve a large number of participants, both blindfolded and with actual visual impairments, and analyse the pros and cons of our design choices. In addition to producing results comparable to the state-of-the-art, we found that Fitts's Law (a predictive model for human movement) provides a suitable a metric that can be used to improve and refine the quality of the audio interface in future mobile navigation aids. CCS Concepts: • Human-centered computing → HCI theory, concepts and models; Laboratory experiments; Sound-based input / output; Pointing devices; Empirical studies in accessibility.