Sensory Substitution Devices (SSDs) convey visual information through audition or touch, targeting blind and visually impaired individuals. One bottleneck to the adoption of SSDs by blind users in everyday life is the constant dependency on sighted instructors throughout the learning process. Here, we present a proof of concept for the efficacy of an online self-training program developed for learning the basics of the EyeMusic visual-to-auditory SSD, tested on sighted blindfolded participants. Additionally, aiming to identify the best training strategy to be later adapted for the blind, we compared multisensory vs. unisensory as well as perceptual vs. descriptive feedback approaches. To these ends, sighted participants performed identical SSD-stimuli identification tests before and after ~75 minutes of self-training on the EyeMusic algorithm. Participants were divided into five groups: four training groups differing in the feedback delivered during training (auditory-descriptive, audio-visual textual description, audio-visual perceptual simultaneous, and audio-visual perceptual interleaved) and a control group that received no training. At baseline, before any EyeMusic training, participants’ identification of SSD-conveyed objects was significantly above chance, highlighting the algorithm’s intuitiveness. Furthermore, self-training led to a significant improvement in accuracy between pre- and post-training tests in each of the four feedback groups versus control, though no significant difference emerged among those groups. Nonetheless, significant correlations between individual post-training success rates and various learning measures acquired during training suggest a trend toward an advantage of multisensory over unisensory feedback strategies, while no trend emerged for perceptual vs. descriptive strategies. The success at baseline strengthens the conclusion that cross-modal correspondences facilitate learning, given that SSD algorithms are based on such correspondences.
Additionally, and crucially, the results highlight the feasibility of self-training for the first stages of SSD learning, and suggest that for these initial stages, unisensory training, which is also easily implemented for blind and visually impaired individuals, may suffice. Together, these findings could boost the use of SSDs for rehabilitation.