Interspeech 2015
DOI: 10.21437/interspeech.2015-193

Fisher vectors with cascaded normalization for paralinguistic analysis

Cited by 39 publications
(11 citation statements)
References 30 publications
“…The UAR of the baseline training partition, set using LOSO-CV, was 61.3%, and the test set UAR 65.9%. The winning entry to the challenge was obtained using a system based on Fisher vector encoding of extracted features and cascaded normalisation to account for speaker and phonetic variability (Kaya et al, 2015). This approach achieved a test-set UAR of 83.1%.…”
Section: iHEARu-EAT Database (mentioning)
confidence: 99%
“…The UARs for the training and test partition for this system were 35.2% and 32.8%, respectively. Despite the additional information stream provided by the video data, no entrants to the 2018 challenge were able to better the UAR of 83.1% obtained in Kaya et al (2015). A detailed summary of the EAT challenge can be found in Schuller and Schuller (2020).…”
Section: iHEARu-EAT Database (mentioning)
confidence: 99%
“…Despite the fact that just a handful of studies use FV in speech processing (e.g. for categorizing audio-signals as speech, music and others [29], for speaker verification [30,31], and for determining the food type from eating sounds [32]), we think that FV can be harnessed to improve classification performance in audio processing.…”
Section: Fisher Vectors (mentioning)
confidence: 99%
“…However, the constituents and the combination rules of the ensemble systems must be selected with care. Kaya and colleagues previously applied the Fisher Vector (FV) encoding of acoustic Low-Level Descriptors (LLD) to several paralinguistic tasks including recognition of native language and sincerity [7], as well as classification of snoring types [8] and eating conditions [9]. In this work, we use a similar FV encoding for the representation of acoustic features.…”
Section: Introduction (mentioning)
confidence: 99%