2014 13th International Conference on Machine Learning and Applications 2014
DOI: 10.1109/icmla.2014.88

Multimodal Music and Lyrics Fusion Classifier for Artist Identification

Cited by 6 publications
(5 citation statements)
References 12 publications
“…This group thinks of multimodality as different communication methods: visual, linguistic, spatial, aural, and gestural (11). Multimodal music datasets within this context may incorporate audio, gestures, and written documents, such as lyrics (12).…”
Section: Existing Definitions (mentioning)
confidence: 99%
“…The audio domain has 3 articles, where recognition [391], detection [392] and translation [393] each have 1. Detection [394], prediction [395] and translation [396] each have one paper from 3 papers in the game domain.…”
Section: Inclusion Criteria (mentioning)
confidence: 99%
“…[98], [100], [104], [116], [125], [127], [187], [206], [241], [326], [365], [395], [411], Image & Numerical [62], [75], [119], [126], [167], [313], [331], [353], [405], [410], Audio & Text & Sensor [384], Audio & Text [180], [282], [377], [391], [392], Text & Signal [109], Text & Numerical [304], [349], Sensor & Signal [240], [242], [258], [389], Sensor & Numerical [183], Signal & Numerical [205], [257], [260], [318]. Figure 10 displays the extracted information related to each modality and data type with the links between them.…”
Section: B Task (mentioning)
confidence: 99%
“…• artist identification, through lyrics and audio fusion [38];
• derivative works classification of YouTube video through audio, video, titles and authors [39];
• instrument classification by exploiting audio recordings and performance video [40], [41];
• tonic identification, that is: given an audio recording and the note level, find the tonic [42];
• expressive musical description, which consists in associating a musical annotation to an audio recording by extracting features with the help of symbolic level [43].…”
Section: Classification (mentioning)
confidence: 99%
“…• conversion from one modality to the other, such as in query-by-humming (which includes a conversion from audio to the symbolic level) or in audio-to-score alignment, where symbolic scores can be converted to audio through a synthesis process;
• feature selection through Linear Discriminant Analysis (LDA) [28] or ReliefF [43];
• normalization of the extracted features [48];
• source-separation in lyrics-to-audio alignment and source association [63], [64];
• chord labeling on audio only [62];
• multi-pitch estimation on audio only [65];
• video-based hand tracking [59];
• tf-idf-based statistics (see section V-C) adapted for audio [38].
Finally, we think that a step worthy of a particular attention is the conversion to a common space of the extracted features, to make them comparable. We will talk about this step in section VI.…”
Section: Data Pre-processing (mentioning)
confidence: 99%