The goal of the See ColOr project is to build a noninvasive mobility aid for blind users that uses the auditory pathway to represent frontal image scenes in real time. We present and discuss two image processing methods tested in this work: image simplification by means of segmentation, and guiding the focus of attention through the computation of visual saliency. A mean shift segmentation technique gave the best results, but to meet real-time constraints we implemented a simpler image quantisation method based on the HSL colour system. More specifically, we have developed two prototypes which transform HSL coloured pixels into spatialised classical instrument sounds lasting 300 ms. Hue is sonified by the timbre of a musical instrument, saturation by one of four possible notes, and luminosity by a bass when the pixel is rather dark and by a singing voice when it is relatively bright. The first prototype is devoted to static images on the computer screen, while the second is built on a stereoscopic camera which estimates depth by triangulation. In the audio encoding, distance to objects is quantised into four duration levels. Six participants with their eyes covered by a dark cloth were trained to associate colours with musical instruments and then asked to identify objects with specific shapes and colours in several pictures. To simplify the experimental protocol, we used a tactile tablet, which took the place of the camera. Overall, colour was helpful for the interpretation of image scenes. Moreover, preliminary results with the second prototype, which involved the recognition of coloured balloons, were very encouraging. In the future, image processing techniques such as saliency could accelerate the interpretation of sonified image scenes.
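The pixel-to-sound mapping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract states only that hue selects an instrument timbre, saturation one of four notes, luminosity a bass or a singing voice, and depth one of four duration levels. The instrument names, the number of hue bins, the note names, the brightness threshold, the duration values, and the maximum depth below are all hypothetical placeholders chosen for illustration.

```python
import colorsys

# Hypothetical hue bins: the abstract does not say which instruments are used
# or how many hue classes exist.
HUE_INSTRUMENTS = ["oboe", "viola", "violin", "flute", "trumpet", "piano", "saxophone"]
# Hypothetical note names for the four saturation levels.
SATURATION_NOTES = ["C", "G", "B-flat", "E"]
# Hypothetical durations in milliseconds for the four depth levels.
DEPTH_DURATIONS_MS = [90, 160, 230, 300]


def quantise(value, levels):
    """Map a value in [0, 1] to an integer bin in [0, levels - 1]."""
    return max(0, min(int(value * levels), levels - 1))


def sonify_pixel(r, g, b, depth_m=None, dark_threshold=0.5, max_depth_m=4.0):
    """Return a symbolic sound description for one 8-bit RGB pixel.

    dark_threshold and max_depth_m are assumed values, not from the source.
    """
    # colorsys returns (hue, lightness, saturation), each in [0, 1].
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    sound = {
        "instrument": HUE_INSTRUMENTS[quantise(h, len(HUE_INSTRUMENTS))],
        "note": SATURATION_NOTES[quantise(s, len(SATURATION_NOTES))],
        "luminosity_voice": "bass" if l < dark_threshold else "singing voice",
    }
    if depth_m is not None:
        level = quantise(depth_m / max_depth_m, len(DEPTH_DURATIONS_MS))
        sound["duration_ms"] = DEPTH_DURATIONS_MS[level]
    return sound


if __name__ == "__main__":
    print(sonify_pixel(255, 0, 0))             # saturated red pixel
    print(sonify_pixel(0, 0, 100, depth_m=1))  # dark blue pixel one metre away
```

Uniform binning is used here only because it is the simplest quantisation; a real system might use perceptually motivated hue boundaries and would need separate handling for achromatic pixels, whose hue is undefined.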
Abstract: Several systems have been introduced for vision substitution by the auditory channel. One such system, See ColOr, is presented here; it is a dedicated interface that forms part of a mobility aid for visually impaired people. It transforms a small portion of a colored video image into spatialized instrument sounds. The purpose of this work is to verify the hypothesis that sounds from musical instruments provide an alternative to vision for obtaining color information from the environment. We introduce an experiment in which several participants try to match pairs of colored socks by pointing a head-mounted camera and listening to the generated sounds. Our experiments showed that blindfolded individuals were able to accurately match pairs of colored socks. The advantage of the See ColOr interface is that it gives the user prompt auditory feedback about the environment and its colors. Our perceptual auditory coding of pixel values opens the way to more complex experiments related to vision tasks, such as perceiving the environment by interpreting its colors.