Benchmarking nearest neighbor retrieval of zebra finch vocalizations across development

Tomka, Tomas; Hao, Xinyu; Miao, Aoxue; Lee, Kanghwi; Basha, Maris; Reimann, Stefan; Zai, Anja T.; Hahnloser, Richard H. R.

doi:10.1101/2023.09.04.555475

2023

Preprint

Abstract: Vocalizations are highly specialized motor gestures that regulate social interactions. The reliable detection of vocalizations from raw streams of microphone data remains an open problem even in research on widely studied animals such as the zebra finch. A promising method for finding vocal samples from potentially few labelled examples (templates) is nearest neighbor retrieval, but this method has never been extensively tested on vocal segmentation tasks. We retrieve zebra finch vocalizations as neighbors of …
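To make the idea of nearest neighbor retrieval from a few templates concrete, here is a minimal Python sketch. It is not the authors' method: the abstract is truncated and does not state the features, the distance metric, or the number of neighbors. So the 2-D toy feature vectors, the Euclidean distance, the parameter `k`, and the function name `retrieve_neighbors` are all assumptions made for illustration.

```python
import numpy as np

def retrieve_neighbors(templates, candidates, k=3):
    """Return sorted indices of candidates that are among the k nearest
    neighbors (Euclidean distance, an assumption) of at least one template."""
    # Pairwise distances with shape (n_templates, n_candidates).
    diffs = templates[:, None, :] - candidates[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Never ask for more neighbors than there are candidates.
    k = min(k, candidates.shape[0])
    # For each template, take the indices of its k closest candidates.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Merge the per-template results into one sorted set of indices.
    return np.unique(nearest)

# Hypothetical toy data: each row stands in for the feature vector of one
# audio segment. Rows 2-4 lie close to the single labelled template.
candidates = np.array([
    [0.0, 0.1],
    [0.2, -0.1],
    [5.0, 5.1],
    [4.9, 5.0],
    [5.2, 4.8],
    [-0.1, 0.0],
])
templates = np.array([[5.0, 5.0]])

retrieved = retrieve_neighbors(templates, candidates, k=3)
print(retrieved.tolist())
```

In this toy setup the three segments closest to the template are returned and the remaining segments are left out. In practice, the choice of features and of `k` (or of a distance threshold instead) would determine how well such retrieval separates vocalizations from background.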

Cited by 1 publication

References 57 publications

Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection

Gu, Lee, Basha et al. 2023

Preprint

Self Cite

This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting.
