Vocal learning is usually studied in songbirds and humans, species that can form auditory templates by listening to acoustic models and then learn to vocalize to match the template. Most other species are thought to develop vocalizations without auditory feedback. However, auditory input influences the acoustic structure of vocalizations in a broad range of birds and mammals. Vocalizations are defined here as sounds generated by forcing air past vibrating membranes. A vocal motor program may generate vocalizations such as crying or laughter, but auditory feedback may be required to match precise acoustic features of vocalizations. This chapter distinguishes limited vocal learning, which uses auditory input to fine-tune acoustic features of an inherited auditory template, from complex vocal learning, in which novel sounds are learned by matching a learned auditory template. Two or three songbird taxa and four or five mammalian taxa are known for complex vocal learning. A broader range of mammals converge in the acoustic structure of their vocalizations when in socially interacting groups, which qualifies as limited vocal learning. All birds and mammals tested use auditory-vocal feedback to adjust their vocalizations to compensate for the effects of noise, and many species modulate their signals as the costs and benefits of communicating vary. This chapter asks whether some forms of auditory-vocal feedback may have provided neural substrates for the evolution of vocal learning. Progress will require more precise definitions of the different forms of vocal learning, a broad comparative review of their presence and absence, and behavioral and neurobiological investigations into the mechanisms underlying these skills.