ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053128

Neural Percussive Synthesis Parameterised by High-Level Timbral Features

Abstract: We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds without extensive knowledge of signal processing. We use a feedforward convolutional neural network-based architecture, which is able to map input parameters to the corresponding waveform. We propose two datasets to evaluate our approach on both a restrictive context…

Citations: cited by 12 publications (38 citation statements)
References: 7 publications
“…The Wave-U-Net architecture works with fixed-length input and output, requiring all loops to have the same length. To this end, we use the rubberband library 6 to time-stretch all the loops to 130BPM. We verified that this time-stretch did not create artifacts and that the loops were still sounding coherent after this step.…”
Section: Dataset Curation and Analysis (mentioning)
confidence: 99%
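
The tempo normalisation described in the statement above can be sketched as a small calculation. This is only an illustration under stated assumptions: the 120 BPM source tempo, the 16-beat loop length, and the helper names `stretch_ratio` and `loop_duration_seconds` are hypothetical, and the call into the rubberband library itself is deliberately left out because the citing paper's exact invocation is not given here.

```python
def stretch_ratio(source_bpm: float, target_bpm: float = 130.0) -> float:
    """Duration multiplier that moves a loop from source_bpm to target_bpm.

    A value below 1.0 shortens the loop (it gets faster); above 1.0 lengthens it.
    """
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("tempos must be positive")
    return source_bpm / target_bpm


def loop_duration_seconds(num_beats: int, bpm: float) -> float:
    """Length in seconds of a loop containing num_beats beats at the given tempo."""
    return num_beats * 60.0 / bpm


# Hypothetical example: a 16-beat loop recorded at 120 BPM.
original = loop_duration_seconds(16, 120.0)
stretched = original * stretch_ratio(120.0)

# After stretching, the loop matches the length of any 16-beat loop at 130 BPM,
# which is what a fixed-length model input requires.
target = loop_duration_seconds(16, 130.0)
```

Because every loop with the same beat count ends up with the same duration, the fixed input and output length mentioned in the statement is satisfied regardless of the original tempo.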
“…This feature represents the energy distribution across the 12-note chromatic scale commonly used in western music and is also intuitive for music makers. To model the abstract non-harmonic texture of the loop, we used the perceptually pertinent timbral features proposed by Pearce et al. [6,23]. These are hardness, depth, brightness, roughness, boominess, warmth and sharpness.…”
Section: Global Conditioning Features (mentioning)
confidence: 99%
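
The first feature in the statement above, an energy distribution over the 12 notes of the chromatic scale, resembles what is commonly called a chroma vector. The sketch below shows one simple way such a vector could be built from a list of spectral peaks. It is an assumption-laden illustration, not the citing paper's method: the function names, the A4 = 440 Hz reference, and the (frequency, energy) peak input are all hypothetical choices.

```python
import math


def pitch_class(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Map a frequency to a pitch class 0..11 (0 = C, 9 = A) in equal temperament."""
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    return int(round(midi)) % 12


def chroma_from_peaks(peaks):
    """Sum peak energies per pitch class and normalise so the bins add up to 1.

    peaks: iterable of (frequency_hz, energy) pairs; assumed input format.
    """
    chroma = [0.0] * 12
    for freq_hz, energy in peaks:
        chroma[pitch_class(freq_hz)] += energy
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma


# Hypothetical peaks: two A notes an octave apart and one middle C.
example = chroma_from_peaks([(440.0, 1.0), (880.0, 1.0), (261.63, 2.0)])
```

Octaves fold onto the same bin, so the two A peaks share one entry. That folding is what makes the representation compact and readable for music makers.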
“…Hellmer et al [15] introduced methods to quantify the measures of microtiming of a drum track, which is also adopted in this work. Although not performance-related, there exist other works on drums, such as beat generation [16,17], drum sound synthesis [18,19] and interface design [20,21].…”
Section: Related Work (mentioning)
confidence: 99%
“…At generation time, this conditional signal can be varied as a means of user control over the generated output. Along this line, models for audio synthesis have been conditioned on pitch information [36,38], categorical semantic tags [42], lyrics [5], or on perceptual features [43,44]. Other approaches use conditioning in symbolic music generation [45], or constraints during generation instead of conditional information during training [7,46].…”
Section: Related Work (mentioning)
confidence: 99%