2019
DOI: 10.1101/811661
Preprint
Low-dimensional learned feature spaces quantify individual and group differences in vocal repertoires

Abstract: Vocalization is an essential medium for social and sexual signaling in most birds and mammals. Consequently, the analysis of vocal behavior is of great interest to fields such as neuroscience and linguistics. A standard approach to analyzing vocalization involves segmenting the sound stream into discrete vocal elements, calculating a number of handpicked acoustic features, and then using the feature values for subsequent quantitative analysis. While this approach has proven powerful, it suffers from …


Cited by 17 publications (46 citation statements)
References: 46 publications
“…Given the artificial nature of optogenetic stimulation, we wondered whether USVs elicited by optogenetic activation of POA neurons were acoustically similar to the USVs that are normally produced by mice during social interactions. To compare the acoustic features of optogenetically elicited USVs (opto-USVs) to those of USVs produced spontaneously to a nearby female, we employed a recently described method using variational autoencoders (VAEs) (Goffinet, 2019; Sainburg et al., 2019). Briefly, the VAE is an unsupervised modeling approach that uses spectrograms of vocalizations as inputs and from these data learns a pair of probabilistic maps, an "encoder" and a "decoder," capable of compressing vocalizations into a small number of latent features while attempting to preserve as much information as possible (Fig.…”
Section: Acoustic Characterization of USVs Elicited by Activation of …
confidence: 99%
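The encoder/decoder pair described in the quoted passage can be sketched as follows. This is a minimal illustrative VAE in PyTorch, not the architecture from the cited work: the spectrogram size (128×128), hidden width, and latent dimension are placeholder assumptions.

```python
import torch
import torch.nn as nn

class SpectrogramVAE(nn.Module):
    """Toy VAE: flattened spectrogram -> latent features -> reconstruction.

    All sizes here are illustrative placeholders, not those of any published model.
    """
    def __init__(self, n_pixels=128 * 128, latent_dim=32, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_pixels, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, latent_dim)      # mean of q(z|x)
        self.to_logvar = nn.Linear(hidden, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_pixels), nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # reparameterization trick: sample z while keeping gradients
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    """Reconstruction error plus KL divergence from the unit-Gaussian prior."""
    recon_err = ((recon - x) ** 2).sum()
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum()
    return recon_err + kl
```

After training, the encoder's `mu` output serves as the low-dimensional "latent syllable representation" that downstream analyses compare across conditions.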
“…3C, right). To quantify the difference between female-directed and opto-USVs for each mouse, we estimated the Maximum Mean Discrepancy (MMD; Gretton, 2012) between distributions of latent syllable representations, as in (Goffinet, 2019). In addition, a baseline level of variability in syllable repertoire was established for each mouse by estimating the MMD between the first and second halves of female-directed USVs emitted in a recording session. Unusual opto-USVs tended to be very loud and also had high frequency bandwidth.…”
Section: Acoustic Characterization of USVs Elicited by Activation of …
confidence: 99%
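The MMD comparison between two sets of latent representations can be sketched with the standard unbiased estimator. The Gaussian kernel and bandwidth below are illustrative assumptions, not necessarily the choices made in the cited work.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Unbiased estimate of squared Maximum Mean Discrepancy between samples x and y."""
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    n, m = len(x), len(y)
    # exclude the diagonal so the within-sample terms are unbiased
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()
```

Comparing `mmd2(opto_latents, female_latents)` against a within-session baseline, as the passage describes, asks whether the two repertoires differ by more than ordinary session-to-session variability.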
“…The utility of non-linear dimensionality reduction techniques is just now coming to fruition in the study of animal communication, for example using t-distributed stochastic neighbor embedding (t-SNE; [32]) to describe the development of zebra finch song [34], using Uniform Manifold Approximation and Projection (UMAP; [31]) to describe and infer categories in birdsong [3, 35], or using deep neural networks to synthesize naturalistic acoustic stimuli [36, 37]. Developments in non-linear representation learning have helped fuel the most recent advances in machine learning, untangling statistical relationships in ways that provide more explanatory power over data than traditional linear techniques [13, 14].…”
Section: Introduction
confidence: 99%
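As a toy illustration of the kind of non-linear embedding the passage describes, the sketch below applies scikit-learn's t-SNE to synthetic "syllable features"; the data, dimensions, and parameters are invented for illustration only.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# synthetic stand-ins for two acoustic categories of syllables, 20 features each
cluster_a = rng.normal(0.0, 1.0, size=(50, 20))
cluster_b = rng.normal(4.0, 1.0, size=(50, 20))
features = np.vstack([cluster_a, cluster_b])

# non-linear projection of the 20-D features down to 2-D for visualization
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(features)
print(embedding.shape)  # (100, 2)
```

In the embedded 2-D space, syllables with similar feature vectors land near one another, which is what lets such maps expose categories and developmental trajectories in vocal data.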
“…Aspects of other deep networks applied to animal motor control may improve TweetyNet. Examples include object-detection architectures [47, 48] applied to mouse ultrasonic vocalizations and animal motion tracking, and generative architectures applied to birdsong and other vocalizations [49–51]. Lastly, we note that, in principle, TweetyNet and the vak library can be applied to any other annotated vocalization, including the calls of bats, mouse ultrasonic vocalizations, and dolphin communication.…”
Section: Discussion
confidence: 99%