“…Over the past two decades there has been growing interest in the usage and signaling of vocalizations in mice, with large efforts going into studying the underlying neurobiological mechanisms of auditory processing (Pomerantz et al., 1983; Liu et al., 2003; Holy and Guo, 2005; Neilans et al., 2014; Perrodin et al., 2020) and the production of vocalizations (Arriaga et al., 2012; Chabout et al., 2016; Okobi et al., 2019; Zimmer et al., 2019; Gao et al., 2019; Tschida et al., 2019; Michael et al., 2020). The tools available for experiments in mice provide a promising model for studying the neural basis of vocalizations, as well as the effects of genes on the origin and development of vocal and neural anatomy (Grimsley et al., 2011; Bowers et al., 2013; Chabout et al., 2016; Tabler et al., …) … (Steinfath et al., 2021). DeepSS learns a representation of sound features directly from raw audio recordings using temporal convolutional networks (TCNs), which are based on dilated convolutions.…”

[Source: bioRxiv preprint, this version posted August 13, 2021; doi: https://doi.org/10.1101/2021.08.13.456283]
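To make the last sentence concrete, the following is a minimal NumPy sketch of the dilated 1-D convolution that TCNs such as DeepSS stack to grow their receptive field over raw audio. This is an illustration of the generic operation only, not the DeepSS implementation; the function name, kernel, and layer dilations (1, 2, 4) are hypothetical choices for the example.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Causal 1-D convolution of signal x (shape (T,)) with kernel w
    (shape (k,)) at the given dilation factor. The input is zero-padded
    on the left so the output also has length T."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# Stacking layers while doubling the dilation (1, 2, 4, ...) grows the
# receptive field exponentially: with kernel size k and L layers the
# output at time t depends on (k - 1) * (2**L - 1) + 1 input samples,
# which is what lets a TCN summarize long stretches of raw audio.
x = np.random.randn(16)   # toy "raw audio" snippet
w = np.array([0.5, 0.25, 0.25])
y = x
for d in (1, 2, 4):
    y = dilated_conv1d(y, w, d)
print(y.shape)  # (16,)
```

With a single-tap kernel the operation is the identity, and a kernel of `[0, 1]` at dilation d shifts the signal right by d samples — both easy ways to sanity-check the indexing.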