2022
DOI: 10.1007/978-3-031-16449-1_33

Conditional Generative Data Augmentation for Clinical Audio Datasets

Abstract: In this work, we propose a novel data augmentation method for clinical audio datasets based on a conditional Wasserstein Generative Adversarial Network with Gradient Penalty (cWGAN-GP), operating on log-mel spectrograms. To validate our method, we created a clinical audio dataset which was recorded in a real-world operating room during Total Hip Arthroplasty (THA) procedures and contains typical sounds which resemble the different phases of the intervention. We demonstrate the capability of the proposed method…

Cited by 8 publications (14 citation statements)
References: 25 publications
“…The value of r is a hyperparameter and for our method it was chosen equal to 16. The generator has an overall of 1,537,316 parameters. For the discriminator we use a fully convolutional network architecture with a total of 4,321,153 parameters analogous to our own previous work [12]. Both the generator and discriminator employ the LeakyReLU non-linear activation function throughout the whole network structure.…”
Section: Proposed Data Augmentation Methods
confidence: 99%
“…Channel attention has been successfully exploited to model channel level dependencies and facilitate learning of less redundant features [16] [17] [18] and subsequently improved model performance. Motivated by these observations, in this paper, we demonstrate that due to the huge number of model parameters, conditional generative adversarial network (cWGAN-GP [12]) learns redundant features. To combat this, we introduce a channel-wise attention mechanism in the generator sub-network through the implementation of Squeeze & Excitation [16] block and residual skip connections [19].…”
Section: Introduction
confidence: 97%
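The channel-wise attention described in this statement can be illustrated with a minimal NumPy sketch of a Squeeze & Excitation step followed by a residual skip connection. It is not the citing authors' implementation: the function names, the weight shapes, the LeakyReLU slope of 0.2, and the use of a reduction ratio of 16 (borrowed from the "r" mentioned in another statement, which may or may not denote the same quantity) are all assumptions, and the convolutional layers of a real generator are omitted.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # slope=0.2 is an assumed value; the source does not state it
    return np.where(x > 0, x, slope * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_residual_block(x, w1, w2):
    """Hypothetical SE step with a skip connection.

    x:  feature map of shape (C, H, W)
    w1: squeeze weights of shape (C // r, C)
    w2: excitation weights of shape (C, C // r)
    """
    squeezed = x.mean(axis=(1, 2))           # global average pool -> (C,)
    hidden = leaky_relu(w1 @ squeezed)       # bottleneck -> (C // r,)
    gates = sigmoid(w2 @ hidden)             # per-channel gates in (0, 1)
    recalibrated = x * gates[:, None, None]  # rescale every channel
    return x + recalibrated                  # residual skip connection

rng = np.random.default_rng(0)
channels, r = 64, 16                         # r = 16 is an assumption here
x = rng.standard_normal((channels, 8, 8))
w1 = rng.standard_normal((channels // r, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // r)) * 0.1
out = se_residual_block(x, w1, w2)
print(out.shape)
```

With 64 channels and a ratio of 16, the bottleneck has only 4 units, which is what keeps the attention branch cheap relative to the rest of the network.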
“…We use a publicly available data set 4 [12] recorded during real Total Hip Arthroplasty surgeries and contains sounds of the typical surgical actions that are performed during the intervention and roughly resemble the different phases of the procedure. The data set includes 568 recordings with a length of 1 s to 31 s and the following distribution: n_{raw,Adjustment} = 68, n_{raw,Coagulation} = 117, n_{raw,Insertion} = 76, n_{raw,Reaming} = 64, n_{raw,Sawing} = 21, and n_{raw,Suction} = 222.…”
Section: Data Set Preprocessing and Benchmark Augmentations
confidence: 99%
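The class imbalance in the quoted distribution can be checked with a few lines of Python. The counts are taken from the statement above; balancing every class up to the size of the largest one is only an illustrative assumption and not necessarily the strategy used by the cited or citing authors, who may also segment the raw recordings before training.

```python
# Raw recording counts per class, as quoted in the citation statement.
counts = {
    "Adjustment": 68,
    "Coagulation": 117,
    "Insertion": 76,
    "Reaming": 64,
    "Sawing": 21,
    "Suction": 222,
}

total = sum(counts.values())
largest = max(counts.values())
smallest = min(counts.values())

# Assumed strategy: top up each class to the size of the largest class.
deficit = {label: largest - n for label, n in counts.items()}

print(f"total recordings: {total}")
print(f"imbalance ratio (largest / smallest): {largest / smallest:.2f}")
for label, missing in deficit.items():
    print(f"{label}: {missing} additional samples to match the largest class")
```

The total matches the 568 recordings stated in the quote, and the smallest class (Sawing) is roughly ten times rarer than the largest (Suction), which is the kind of gap that generative augmentation is meant to narrow.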