Probabilistic Segmentation of Folk Music Recordings

Bohak, Ciril; Marolt, Matija

doi:10.1155/2016/8297987

Cited by 4 publications

(4 citation statements)

References 15 publications


“…Segmentation and pitch drift are obtained with a method presented in [25]. We define segmentation as a set of boundaries between repeated segments = {ω_i}, where ω_i represents the beginning time of the i-th segment.…”

Section: Segmentation and Pitch Drift Estimation (mentioning)

confidence: 99%

“…The method achieves an F1 score of 0.76 on a collection of folk music recordings. For more details on the algorithm and its evaluation, we direct the reader to [25].…”

Section: Segmentation and Pitch Drift Estimation (mentioning)

confidence: 99%

“…To be able to compare and align different parts of a song, we need to adjust the F0 estimates according to the estimated change of intonation. As pitch drift D is already estimated during segmentation [25], F0 values are adjusted in this step according to the estimated drift, as in:…”

Section: Adjusting For Pitch Drift (mentioning)

confidence: 99%
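
The equation referred to in the statement above is truncated in the excerpt, so the following is not the citing paper's formula. It is only a minimal sketch of one common way to compensate F0 estimates for drift, assuming the drift is expressed in cents per frame and that a positive value means the performance has gone sharp. The function name `adjust_f0_for_drift` and both conventions are assumptions made for illustration.

```python
import numpy as np

def adjust_f0_for_drift(f0_hz, drift_cents):
    """Shift F0 estimates (Hz) by the negative of an estimated drift (cents).

    Assumption: drift is given in cents, one value per F0 frame, with
    positive values meaning the pitch has drifted upwards.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    drift_cents = np.asarray(drift_cents, dtype=float)
    # There are 1200 cents in an octave, so a shift of c cents
    # corresponds to a frequency ratio of 2 ** (c / 1200).
    return f0_hz * 2.0 ** (-drift_cents / 1200.0)
```

Frames with an F0 of 0 Hz (a frequent marker for unvoiced frames) stay at 0 under this multiplicative correction.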

“…Alignment is performed with dynamic time warping (DTW) over segment pairs represented by respective piano roll excerpts P_{ω_i}. We use correlation distance (one minus correlation coefficient) as our local distance measure, which has been shown to perform well for this task (as shown in [25,26]). To be more robust to individual incorrectly performed notes and other pitch fluctuations, such as vibrato or pitch changes during onset and offset, we smooth the piano roll representation with a Gaussian filter over the frequency axis prior to DTW calculation.…”

Section: Segment Alignment and Summarization (mentioning)

confidence: 99%
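
The alignment step described in the statement above can be sketched roughly as follows. This is an illustrative sketch, not the citing paper's implementation: it assumes piano rolls stored as NumPy arrays of shape (pitch bins, frames) with more pitch bins than filter taps, a standard DTW step pattern (horizontal, vertical, diagonal), and a hypothetical smoothing width `sigma`. All function names are made up for this example.

```python
import numpy as np

def gaussian_smooth_freq(piano_roll, sigma=1.0):
    """Smooth a (pitch_bins, frames) piano roll along the pitch axis."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    # Convolve every frame (column) with the kernel along the pitch axis.
    # Assumes the roll has more pitch bins than the kernel has taps.
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, piano_roll
    )

def correlation_distance(u, v, eps=1e-12):
    """One minus the Pearson correlation coefficient of two frames."""
    u = u - u.mean()
    v = v - v.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom < eps:  # constant (e.g. silent) frame: correlation undefined
        return 1.0
    return 1.0 - float(u @ v) / denom

def dtw(a, b, sigma=1.0):
    """Return (total cost, warping path) for two piano-roll excerpts."""
    a = gaussian_smooth_freq(np.asarray(a, dtype=float), sigma)
    b = gaussian_smooth_freq(np.asarray(b, dtype=float), sigma)
    n, m = a.shape[1], b.shape[1]
    local = np.array([[correlation_distance(a[:, i], b[:, j])
                       for j in range(m)] for i in range(n)])
    # Accumulated cost with a padding row and column of infinities.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Backtrack from the last cell to the first along cheapest predecessors.
    path = [(n - 1, m - 1)]
    i, j = n, m
    while (i, j) != (1, 1):
        steps = [(acc[i - 1, j - 1], i - 1, j - 1),
                 (acc[i - 1, j], i - 1, j),
                 (acc[i, j - 1], i, j - 1)]
        _, i, j = min(steps)
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]

# Usage: align a short one-hot melody with a time-stretched copy of itself.
roll = np.zeros((12, 5))
roll[[2, 4, 5, 7, 9], np.arange(5)] = 1.0
cost, path = dtw(roll, np.repeat(roll, 2, axis=1))
```

Smoothing over the pitch axis means that two frames whose active pitches are a bin or two apart still correlate positively, which is the robustness to slightly mis-sung notes that the statement describes.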


Transcription of Polyphonic Vocal Music with a Repetitive Melodic Structure

Bohak; Marolt

2016

J. Audio Eng. Soc.

Self Cite


This paper presents a novel method for transcription of folk music that exploits its specifics to improve transcription accuracy. In contrast to most commercial music, folk music recordings may contain various inaccuracies, as they are usually performed by amateur musicians and recorded in the field. If we use standard approaches for transcription, these inaccuracies are reflected in erroneous pitch estimates. On the other hand, the structure of Western folk music is usually simple, as songs are often composed of repeated melodic parts. In our approach we make use of these repetitions to increase transcription robustness and improve its accuracy. The proposed method fuses three sources of information: (1) frame-based multiple F0 estimates, (2) song structure, and (3) pitch drift estimates. It first selects a representative segment of the analyzed song and aligns all the other segments to it, considering temporal as well as frequency deviations. Information from all segments is summarized and used in a two-layer probabilistic model, based on explicit duration HMMs, to segment frame-based information into notes. The method is evaluated against state-of-the-art transcription methods, where we show that a significant improvement in accuracy can be achieved.
