2023
DOI: 10.3389/fbinf.2022.966066

Interactive extraction of diverse vocal units from a planar embedding without the need for prior sound segmentation

Abstract: Annotating and proofreading data sets of complex natural behaviors such as vocalizations are tedious tasks because instances of a given behavior need to be correctly segmented from background noise and must be classified with minimal false positive error rate. Low-dimensional embeddings have proven very useful for this task because they can provide a visual overview of a data set in which distinct behaviors appear in different clusters. However, low-dimensional embeddings introduce errors because they fail to …

Cited by 3 publications (5 citation statements)
References 18 publications
“…While there are many methods for single channel segmentation available [38][39][40], only few combine information from multiple sensor channels [40] and none has made use of accelerometer data so far. Also, it remains open how well vocalizations can be separated from the five stationary microphones alone.…”
Section: Discussion
confidence: 99%
“…To enable automatic segmentation of longitudinal BirdPark data, new methods will need to be developed. While there are many methods for single channel segmentation available [38][39][40], only few combine information from multiple sensor channels [40] and none has made use of accelerometer data so far.…”
Section: Discussion
confidence: 99%
“…We annotated vocal segments (not further classified into vocalization types) with high temporal accuracy. To generate these gold-standard (GS) annotations, we used a semi-supervised segmentation method [13], correcting poor segments and eliminating false positives by visual inspection of spectrograms. To eliminate false negatives, the present NN method was used with the cosine distance as metric.…”
Section: Methods
confidence: 99%
“…Entirely lacking are public datasets of precisely segmented subsongs; a recent massive-data study on this important developmental phase [12] simply ignores the segmentation problem and takes as proxy of vocalizations all amplitude-thresholded sound segments, semi-automatically excluding false positives in such a way to introduce false negatives (see Appendix). Unfortunately, amplitude thresholding can create severe problems if the recording quality is low [13], which only emphasizes that this severe lack of training and test data forms a bottleneck for progress in large-scale research on vocal development, and it calls for the creation of gold-standard data sets.…”
Section: Introduction
confidence: 99%