Real-time music accompaniment generation has a wide range of applications in the music industry, such as music education and live performance. However, automatic real-time accompaniment generation remains understudied and often faces a trade-off between logical latency and exposure bias. In this paper, we propose SongDriver, a real-time music accompaniment generation system with neither logical latency nor exposure bias. Specifically, SongDriver divides one accompaniment generation task into two phases: 1) the arrangement phase, in which a Transformer model arranges chords for the input melody in real time and caches them for the next phase instead of playing them out; 2) the prediction phase, in which a CRF model generates playable multi-track accompaniment for the upcoming melody based on the previously cached chords. With this two-phase strategy, SongDriver directly generates the accompaniment for the upcoming melody, achieving zero logical latency. Furthermore, when predicting chords for a timestep, SongDriver refers to the chords cached in the first phase rather than to its own previous outputs, thereby avoiding exposure bias.
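To make the two-phase strategy concrete, the sketch below illustrates how the arrangement and prediction phases could interleave at each timestep of a real-time session. It is only a conceptual illustration under assumed interfaces, not the authors' implementation: the class name and the `arrange`/`predict` methods are hypothetical placeholders for the Transformer arranger and CRF predictor.

```python
# Conceptual sketch of SongDriver's two-phase strategy (not the paper's code).
# `arranger` and `predictor` are assumed to wrap the Transformer chord-arrangement
# model and the CRF accompaniment model, respectively.

class SongDriverSketch:
    def __init__(self, arranger, predictor):
        self.arranger = arranger        # Transformer-based chord arranger (assumed interface)
        self.predictor = predictor      # CRF-based accompaniment predictor (assumed interface)
        self.cached_chords = []         # chords arranged so far; cached, never played directly

    def step(self, melody_so_far):
        # Prediction phase: generate the multi-track accompaniment for the
        # upcoming melody from chords cached at earlier timesteps, so the
        # accompaniment is ready to play with zero logical latency.
        accompaniment = self.predictor.predict(melody_so_far, self.cached_chords)

        # Arrangement phase: arrange a chord for the current input melody and
        # cache it for future prediction steps instead of playing it out.
        chord = self.arranger.arrange(melody_so_far)
        self.cached_chords.append(chord)

        # Because prediction conditions on cached arranged chords rather than on
        # the model's own previously generated outputs, exposure bias is avoided.
        return accompaniment
```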