Unlike many species, song-learning birds and humans have independently evolved the ability to communicate via learned vocalizations. Both birdsong and spoken language are culturally transmitted across generations, within species-specific constraints that leave room for considerable variation. We review the commonalities and differences between vocal-learning bird species and humans at the behavioral, developmental, neuroanatomical, physiological, and genetic levels. We propose that cultural transmission of vocal repertoires is a natural consequence of the evolution of vocal learning, and that at least some species-specific universals, as well as species differences in cultural transmission, are due to differences in vocal learning phenotypes, which are shaped by genetic constraints. We suggest that it is the balance between these constraints and features of the social environment that allows cultural learning to propagate, and we describe new opportunities for meaningful comparisons of birdsong and human vocal culture that focus on the ontogeny of vocal interactivity. Expected final online publication date for the Annual Review of Linguistics, Volume 7 is January 14, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.