Interspeech 2013
DOI: 10.21437/interspeech.2013-34

A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing sounds

Cited by 7 publications (5 citation statements)
References 26 publications
“…In order to obtain the ground truth GCIs, we first pre-processed the EGG signals using 5th-order high and low-pass filters with respective cutoff frequencies of 30 and 500 Hz. Then, the ground-truth GCI positions were obtained using peak-picking on the negative peaks of the derivative of the pre-processed EGG signal (as done in [12,21]). The peak-picking was done using the code provided by the authors of [21], available online (footnote 2) (getpeaks function in dataloader.py) to have a comparable ground truth for evaluation.…”
Section: CMU Arctic Dataset (mentioning)
confidence: 99%
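The ground-truth procedure described in this statement (band-limit the EGG, differentiate, pick negative peaks) can be sketched roughly as follows. This is not the getpeaks function from dataloader.py, whose contents are not shown here. The function name egg_to_gci, the Butterworth design, the zero-phase filtering, the relative peak-height threshold and the minimum peak spacing are all assumptions made for illustration. Only the filter order and the 30 Hz and 500 Hz cutoffs come from the quoted text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks


def egg_to_gci(egg, fs, low_hz=30.0, high_hz=500.0, order=5,
               rel_height=0.25, min_period_s=0.002):
    """Return sample indices of GCI candidates from an EGG signal.

    Hypothetical helper: Butterworth filters, zero-phase filtering,
    rel_height and min_period_s are assumptions, not from the cited work.
    """
    # 5th-order high-pass (30 Hz) and low-pass (500 Hz), as in the quoted text
    sos_hp = butter(order, low_hz, btype="highpass", fs=fs, output="sos")
    sos_lp = butter(order, high_hz, btype="lowpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos_lp, sosfiltfilt(sos_hp, egg))

    # First difference approximates the derivative of the EGG (dEGG)
    degg = np.diff(filtered)

    # Negative peaks of dEGG become positive peaks of -dEGG
    neg = -degg
    height = rel_height * neg.max()            # assumed relative threshold
    distance = max(1, int(min_period_s * fs))  # assumed minimum spacing
    peaks, _ = find_peaks(neg, height=height, distance=distance)
    return peaks


# Synthetic EGG-like sawtooth: slow rise, abrupt fall every 10 ms (100 Hz)
fs = 16000
f0 = 100
phase = (np.arange(fs) * f0 / fs) % 1.0
egg = 2.0 * phase - 1.0
gci = egg_to_gci(egg, fs)
period = fs // f0
```

On the synthetic signal the detected indices should sit next to the abrupt falls, that is, close to multiples of the 160-sample period. Real EGG recordings may have the opposite polarity, in which case the signal would need to be inverted before peak-picking.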
“…Until recently, all approaches used to be based on hand-crafted digital signal processing techniques and heuristics. Thorough reviews of those techniques can be read in [11,12,1], where authors compared their performances on a variety of speech and singing signals. Typically, such methods first compute an intermediate speech representation, such as the linear prediction residual [13], a zero-frequency filtered signal [14], or a mean-based signal [15], which emphasizes the locations of glottal closure instants found at local maxima, impulses or discontinuities.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although such approaches have been shown to perform reasonably well, they sometimes rely on manual parameter tuning (e.g., the mean f0 for SEDREAMS) and the quality of their results remains quite dependent on the characteristics of the analysed speech signal (e.g., pitch and voice quality, speech or singing voice) [10]. Moreover, some algorithms like SEDREAMS [9] or DYPSA [6] also detect GCIs during unvoiced segments and thus rely on further algorithms to filter out GCI candidates in unvoiced parts.…”
Section: Introduction (mentioning)
confidence: 99%
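The post-filtering step mentioned in this statement, discarding GCI candidates that fall in unvoiced parts, could look roughly like the sketch below. It assumes that voiced intervals are already available from a separate voicing detector, which is not specified in the quoted text. The function name keep_voiced_gcis and the interval format are hypothetical.

```python
from bisect import bisect_right


def keep_voiced_gcis(gci_times, voiced_intervals):
    """Keep GCI candidates (in seconds) that fall inside a voiced interval.

    voiced_intervals is assumed to be a sorted list of non-overlapping
    (start, end) pairs in seconds, produced by some voicing detector.
    """
    starts = [start for start, _ in voiced_intervals]
    kept = []
    for t in gci_times:
        # Index of the last interval whose start is not after t
        i = bisect_right(starts, t) - 1
        if i >= 0 and t <= voiced_intervals[i][1]:
            kept.append(t)
    return kept


# Candidates at 0.05 s and 0.30 s lie outside both voiced intervals
candidates = [0.05, 0.12, 0.13, 0.14, 0.30, 0.41, 0.42]
voiced = [(0.10, 0.20), (0.40, 0.50)]
voiced_gcis = keep_voiced_gcis(candidates, voiced)
```

The binary search keeps the lookup cheap when there are many candidates, but a plain loop over the intervals would work just as well for short utterances.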