2019
DOI: 10.3390/app9030439

Automatic Segmentation of Ethnomusicological Field Recordings

Abstract: The article presents a method for segmentation of ethnomusicological field recordings. Field recordings are integral documents of folk music performances captured in the field, and typically contain performances, intertwined with interviews and commentaries. As these are live recordings, captured in non-ideal conditions, they usually contain significant background noise. We present a segmentation method that segments field recordings into individual units labelled as speech, solo singing, choir singing, and in…

Cited by 11 publications (6 citation statements)
References 8 publications
“…Thus, we believe that the best source of scales is in ethnographic recordings spanning the past century [71], so it is imperative that methods be developed that can faithfully infer scales from large samples of songs. Algorithms must be developed that can handle low-quality recordings [144], background noise, instrument / singing segmentation [145], polyphonic stream segmentation [146], note segmentation [147], and tonal drift [148].…”
Section: Limitations To Studying Scale Evolution (mentioning)
confidence: 99%
“…For an overview of the breadth of the field and types of analyses performed see Meredith (2016). For ethnomusicological field recordings a number of algorithms are currently available (e.g., Bozkurt et al (2014), Marolt et al (2019); see Panteli et al (2018) for an overview). Nevertheless, automatic transcription of polyphonic music recordings still needs improvement (Holzapfel et al, 2019).…”
Section: Music Information Retrieval (mentioning)
confidence: 99%
“…As an example of how to integrate such a service, we configured the Dataverse repository so that for each uploaded audio file, the service is called and the generated annotation file is automatically stored along with the original data. As a proof of concept, a segmentation of speech and music was implemented, with the original settings provided by Marolt et al (2019), that achieves an F1 measure up to 0.63. Considering this moderate segmentation accuracy, the need to support manual correction of the annotations by ethnomusicology experts becomes evident in order to verify the quality of the data, which at the same time creates new ground truth datasets that may inform future analysis algorithms.…”
Section: Music Information Retrieval To Automate Tasks (mentioning)
confidence: 99%
“…In a different work, the same authors used techniques from the string matching literature to identify segments in recordings on a frame-level similarity basis [138]. From a probabilistic perspective, Marlot studied a similar type of recordings made by amateur folk musicians, and trained a probabilistic model to segment them into phrases [139]. In Marlot's approach, the signal is first partitioned into fragments that are classified into one of the following categories: speech, solo singing, choir singing, and instrumental music.…”
Section: Musical Skill Acquisition Tasks (mentioning)
confidence: 99%
“…The MFCC is typically constructed using successive temporal windows, thus representing auditory information over time. These coefficients were first used in speech recognition [247], and over the past two decades were shown to be extremely useful in music analysis, serving as a condensed but expressive representation of spectrum over time (see [248,249,250,251,252] for a few examples).…”
Section: Audio Representations and Derived Features (mentioning)
confidence: 99%
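
One of the citation statements above reports an F1 measure of up to 0.63 for speech/music segmentation. That statement does not say how the score was computed, so the following is only a minimal sketch of one common convention: a frame-level F1 for a single class. The function name `frame_f1`, the label strings, and the frame-by-frame comparison are all assumptions made for illustration and are not taken from the cited work.

```python
from typing import Sequence


def frame_f1(reference: Sequence[str], predicted: Sequence[str], positive: str) -> float:
    """Frame-level F1 for one class (hypothetical helper, not from the cited papers)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same number of frames")

    pairs = list(zip(reference, predicted))
    # True positives: frames labelled `positive` in both sequences.
    tp = sum(r == positive and p == positive for r, p in pairs)
    # False positives: predicted `positive` where the reference says otherwise.
    fp = sum(r != positive and p == positive for r, p in pairs)
    # False negatives: reference `positive` frames the prediction missed.
    fn = sum(r == positive and p != positive for r, p in pairs)

    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: one label per fixed-length analysis frame.
reference = ["speech", "speech", "music", "music", "music", "speech"]
predicted = ["speech", "music", "music", "music", "speech", "speech"]

print(round(frame_f1(reference, predicted, positive="music"), 3))  # 0.667
```

In this toy example two of the three reference music frames are recovered and one speech frame is mislabelled as music, so precision and recall are both 2/3. Segment-boundary-based scoring would be a different, equally plausible convention.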