Abstract: Vocalizations are a widespread means of communication in the animal kingdom. Mice use a large repertoire of ultrasonic vocalizations (USVs) in different social contexts, for instance courtship, territorial dispute, dominance and mother-pup interaction. Previous studies have pointed to differences in USVs between contexts, sexes, strains and individuals; however, in many cases the outcomes of the analyses remained inconclusive.

We here provide a more general approach to automatically classify USVs using deep neural networks (DNNs). We classified the sex of the emitting mouse (C57Bl/6) based on the vocalization's spectrogram, reaching unprecedented performance (~84% correct) in comparison with other techniques (Support Vector Machines: 64%, Ridge regression: 52%). Vocalization characteristics of individual mice contribute only mildly, and sex-only classification reaches ~78%. The performance can only partially be explained by a set of classical shape features, with duration, volume and bandwidth being the most useful predictors. Splitting the estimation into two DNNs, one from spectrograms to features (57-82%) and one from features to sex (67%), does not reach the single-step performance.

In summary, the emitter's sex can be successfully predicted from the spectrograms of its vocalizations using DNNs, which outperform other classification techniques. In contrast to previous research, this suggests that male and female vocalizations differ in their spectrotemporal structure, and that this difference is recognizable even in single vocalizations.
bioRxiv preprint first posted online Jun. 28, 2018; doi: http://dx.doi.org/10.1101/358143. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Introduction

Identification of sex on the basis of sensory cues provides important information for successful reproduction. When listening to a conversation, humans can typically make an educated guess about the sexes of the participants. Limited research on this topic has identified multiple acoustic predictors, ranging from the fundamental frequency to formant measures (Pisanski et al. 2016).

Similar to humans, mice vocalize particularly during social interactions (Chabout et al. 2015; Heckman et al. 2016; Heckman et al. 2017; Neunuebel et al. 2015; Portfors and Perkel 2014). The complexity of the vocalizations produced during social interactions can be substantial (Holy and Guo 2005). While in humans and other species sex-specific differences in body dimensions (vocal tract length, vocal fold characteristics) lead to predictable differences in vocalization (Markova et al. 2016; Pfefferle and Fischer 2006), the vocal tract properties of male and female mice have not been shown to differ significantly (Mahrt et al. 2016; Roberts 1975). Hence, for mice the expected differences between male and female USVs are less predictable from physiological characteristics.

Previous research on the properties of male and female ultrasonic vocalizat...