Sensory substitution is a promising technique for mitigating the loss of a sensory modality. Sensory substitution devices (SSDs) work by converting information from the impaired sense (e.g., vision) into another, intact sense (e.g., audition). However, there is a potentially infinite number of ways of converting images into sounds, and it is important that the conversion takes into account the limits of human perception and other user-related factors (e.g., whether the sounds are pleasant to listen to). The device explored here is termed “polyglot” because it generates a very large set of solutions. Specifically, we adapt a procedure that has been in widespread use in the design of technology but has rarely been used as a tool to explore perception: interactive genetic algorithms. In this procedure, a very large range of potential sensory substitution devices can be explored by creating a set of “genes” with different allelic variants (e.g., different ways of translating luminance into loudness). The most successful devices are then “bred” together, and we statistically explore the characteristics of the selected-for traits after multiple generations. The aim of the present study is to produce design guidelines for a better SSD. In three experiments, we vary the way that the fitness of the device is computed: by asking users to rate the auditory aesthetics of different devices (Experiment 1), by measuring participants’ ability to match sounds to images (Experiment 2), and by measuring their ability to perceptually discriminate between two sounds derived from similar images (Experiment 3). In each case, the traits selected for by the genetic algorithm represent the ideal SSD for that task. Taken together, these traits can guide the design of a better SSD.
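To make the procedure concrete, the sketch below shows a minimal interactive genetic algorithm of the kind described above. It is an illustration only, not the study's implementation: the gene names, allelic variants, population parameters, and the `user_fitness` callback (which stands in for a human judgment such as an aesthetic rating or task score) are all hypothetical.

```python
# Minimal sketch of an interactive genetic algorithm for SSD design.
# All genes, alleles, and parameters here are illustrative assumptions,
# not the actual design space used in the experiments.
import random

# Each "gene" is a design dimension of the SSD; each entry lists its
# allelic variants (e.g., how luminance is translated into loudness).
ALLELES = {
    "luminance_to_loudness": ["linear", "logarithmic", "inverse"],
    "vertical_to_pitch": ["low_is_low", "low_is_high"],
    "scan_direction": ["left_to_right", "right_to_left", "radial"],
    "timbre": ["sine", "sawtooth", "noise_band"],
}

def random_device():
    """Sample one candidate SSD: one allele chosen for every gene."""
    return {gene: random.choice(variants) for gene, variants in ALLELES.items()}

def crossover(parent_a, parent_b):
    """"Breed" two devices: each gene is inherited from a random parent."""
    return {gene: random.choice([parent_a[gene], parent_b[gene]])
            for gene in ALLELES}

def mutate(device, rate=0.1):
    """Occasionally swap a gene for a random allelic variant."""
    return {gene: (random.choice(ALLELES[gene]) if random.random() < rate
                   else allele)
            for gene, allele in device.items()}

def evolve(user_fitness, pop_size=20, generations=10):
    """Interactive loop: user_fitness(device) -> score is the stand-in
    for the human-in-the-loop judgment (Experiments 1-3 each supply a
    different fitness measure)."""
    population = [random_device() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=user_fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # select the fittest half
        population = [mutate(crossover(*random.sample(parents, 2)))
                      for _ in range(pop_size)]
    return population  # surviving alleles can then be analyzed statistically
```

After several generations, one would tally how often each allele survives in the final population; on this reading, the over-represented alleles correspond to the "selected-for traits" that the abstract proposes as design guidelines.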