Multi-label classification of music by emotion

Trohidis, Konstantinos; Tsoumakas, Grigorios; Kalliris, George; Vlahavas, Ioannis

doi:10.1186/1687-4722-2011-426793

Cited by 312 publications

(327 citation statements)

References 34 publications

Supporting

Mentioning

308

Contrasting

Unclassified

Order By: Relevance

“…As such, this has been suggested heuristically in numerous works [10,11,13]. Here we have shown that this can be derived as an approximate maximizer of the composite likelihood of the model in Figure 2.…”

Section: Mim and Jmi Criteria Under Br Transformationmentioning

confidence: 85%

See 1 more Smart Citation

Information Theoretic Feature Selection in Multi-label Data through Composite Likelihood

Sechidis

Νικολάου

Brown

2014

Lecture Notes in Computer Science

View full text Add to dashboard Cite

Abstract. In this paper we present a framework to unify information theoretic feature selection criteria for multi-label data. Our framework combines two different ideas; expressing multi-label decomposition methods as composite likelihoods and then showing how feature selection criteria can be derived by maximizing these likelihood expressions. Many existing criteria, until now proposed as heuristics, can be reproduced from a single basis under the proposed framework. Furthermore we can derive new problem-specific criteria by making different independence assumptions over the feature and label spaces. One such derived criterion is shown experimentally to outperform other approaches proposed in the literature on real-world datasets.

show abstract

Section: Mim and Jmi Criteria Under Br Transformationmentioning

confidence: 85%

“…Trohidis et al [11] present a comparison between the J Y:full X:full and J Y:none X:full criteria, but using χ 2 -statistic instead of mutual information, while recently these criteria were re-introduced under the problem transformation approach [10]. Doquire & Verleysen [4] proposed J Y:none X:none .…”

Section: Connections With the Literaturementioning

confidence: 99%

Information Theoretic Feature Selection in Multi-label Data through Composite Likelihood

Sechidis

Νικολάου

Brown

2014

Lecture Notes in Computer Science

View full text Add to dashboard Cite

show abstract

“…We evaluate the performance of label compression and recovery, and multi-label prediction of CL on 21 datasets from different domains and of different scales, including Bibtex ), Corel5k (Duygulu et al 2002), Mediamill (Snoek et al 2006), IMDB (Read 2010), Enron (Tsoumakas 2010), Genbase (Diplaris et al 2005), Medical (Tsoumakas 2010), Emotions (Trohidis et al 2008), Scene (Boutell et al 2004), Slashdot (Read 2010) and 11 sub datasets included in Yahoo dataset (Ueda and Saito 2002). These datasets are collected from different practical problems such as text classification, image annotation, scene classification, music categorization, genomics and web page classification.…”

Section: Datasetsmentioning

confidence: 99%

Compressed labeling on distilled labelsets for multi-label learning

Zhou

Tao

2012

Mach Learn

View full text Add to dashboard Cite

Directly applying single-label classification methods to the multi-label learning problems substantially limits both the performance and speed due to the imbalance, dependence and high dimensionality of the given label matrix. Existing methods either ignore these three problems or reduce one with the price of aggravating another. In this paper, we propose a {0, 1} label matrix compression and recovery method termed "compressed labeling (CL)" to simultaneously solve or at least reduce these three problems. CL first compresses the original label matrix to improve balance and independence by preserving the signs of its Gaussian random projections. Afterward, we directly utilize popular binary classification methods (e.g., support vector machines) for each new label. A fast recovery algorithm is developed to recover the original labels from the predicted new labels. In the recovery algorithm, a "labelset distilling method" is designed to extract distilled labelsets (DLs), i.e., the frequently appeared label subsets from the original labels via recursive clustering and subtraction. Given a distilled and an original label vector, we discover that the signs of their random projections have an explicit joint distribution that can be quickly computed from a geometric inference. Based on this observation, the original label vector is exactly determined after performing a series of Kullback-Leibler divergence based hypothesis tests on the distribution about the new labels. CL significantly improves the balance Mach Learn (2012) 88:69-126 of the training samples and reduces the dependence between different labels. Moreover, it accelerates the learning process by training fewer binary classifiers for compressed labels, and makes use of label dependence via DLs based tests. Theoretically, we prove the recovery bounds of CL which verifies the effectiveness of CL for label compression and multi-label classification performance improvement brought by label correlations preserved in DLs. We show the effectiveness, efficiency and robustness of CL via 5 groups of experiments on 21 datasets from text classification, image annotation, scene classification, music categorization, genomics and web page classification.

show abstract

“…Multi-label classification algorithms [5] address that category of problems where multiple output classes must be selected for each input instance. Examples of multi-label classification include text [6] and music categorization [7]. The commonly used evaluation metrics for multi-label classification problems are hamming-loss, precision and recall.…”

Section: Introductionmentioning

confidence: 99%

Relevance as a Metric for Evaluating Machine Learning Algorithms

Gopalakrishna¹,

Özçelebi²,

Liotta

et al. 2013

Machine Learning and Data Mining in Pattern Recognition

View full text Add to dashboard Cite

In machine learning, the choice of a learning algorithm that is suitable for the application domain is critical. The performance metric used to compare different algorithms must also reflect the concerns of users in the application domain under consideration. In this work, we propose a novel probability-based performance metric called Relevance Score for evaluating supervised learning algorithms. We evaluate the proposed metric through empirical analysis on a dataset gathered from an intelligent lighting pilot installation. In comparison to the commonly used Classification Accuracy metric, the Relevance Score proves to be more appropriate for a certain class of applications.

show abstract

Multi-label classification of music by emotion

Cited by 312 publications

References 34 publications

Information Theoretic Feature Selection in Multi-label Data through Composite Likelihood

Information Theoretic Feature Selection in Multi-label Data through Composite Likelihood

Compressed labeling on distilled labelsets for multi-label learning

Relevance as a Metric for Evaluating Machine Learning Algorithms

Contact Info

Product

Resources

About