How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification?

Szymański, Piotr; Kajdanowicz, Tomasz; Kersting, Kristian

doi:10.3390/e18080282

Cited by 46 publications

(45 citation statements)

References 25 publications

“…In addition to this, RAkEL's computational complexity is high because the generated output spaces are label powersets and the underlying classification algorithm is a parameter, which can considerably change/worsen the training times (e.g., if we use SVMs instead of ordinary decision trees). This approach has been extended by Szymański et al (2016), where the authors propose not to use the original random partitioning of subsets as performed by RAkEL, but rather a data-driven approach. They propose to use community detection approaches from social networks to partition the label space, which can find better subspaces than random search.…”

Section: Related Work (mentioning)

confidence: 99%
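
As a rough illustration of the random baseline that the statement above contrasts with the data-driven approach, the following Python sketch cuts a label set into disjoint random subsets of at most k labels and turns each subset into a label powerset problem. The function names `random_label_partition` and `powerset_targets` are hypothetical, and the disjoint variant is an assumption made for brevity; none of this is taken from the cited papers' code.

```python
import random


def random_label_partition(labels, k, seed=0):
    """Shuffle the labels and cut them into disjoint subsets of at most k labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return [shuffled[i:i + k] for i in range(0, len(shuffled), k)]


def powerset_targets(label_rows, subset):
    """Map each sample's labels, restricted to one subset, to a single class id.

    label_rows is a list of sets of labels, one set per sample.
    Every distinct label combination inside the subset becomes its own class,
    so a subset of k labels can produce up to 2**k classes.
    """
    subset = set(subset)
    class_ids = {}
    targets = []
    for row in label_rows:
        combo = frozenset(row & subset)
        targets.append(class_ids.setdefault(combo, len(class_ids)))
    return targets, class_ids


if __name__ == "__main__":
    rows = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, set()]
    for part in random_label_partition(["a", "b", "c", "d", "e"], k=2, seed=1):
        print(part, powerset_targets(rows, part)[0])
```

One multi-class classifier would then be trained per subset, which is why the choice of base classifier affects training time as the statement notes.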

Ensembles for multi-target regression with random output selections

2018

We address the task of multi-target regression, where we generate global models that simultaneously predict multiple continuous variables. We use ensembles of generalized decision trees, called predictive clustering trees (PCTs), in particular bagging and random forests (RF) of PCTs and extremely randomized PCTs (extra PCTs). We add another dimension of randomization to these ensemble methods by learning individual base models that consider random subsets of target variables, while leaving the input space randomizations (in RF PCTs and extra PCTs) intact. Moreover, we propose a new ensemble prediction aggregation function, where the final ensemble prediction for a given target is influenced only by those base models that considered it during learning. An extensive experimental evaluation on a range of benchmark datasets has been conducted, where the extended ensemble methods were compared to the original ensemble methods, individual multi-target regression trees, and ensembles of single-target regression trees in terms of predictive performance, running times and model sizes. The results show that the proposed ensemble extension can yield better predictive performance, reduce learning time or both, without a considerable change in model size. The newly proposed aggregation function gives best results when used with extremely randomized PCTs. We also include a comparison with three competing methods, namely random linear target combinations and two variants of random projections.
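
The aggregation idea in this abstract can be sketched in a few lines of Python. The abstract only says that the prediction for a target is influenced by the base models that considered that target during learning; the plain averaging used here, the function name `aggregate_by_seen_targets`, and the data layout are assumptions for illustration.

```python
def aggregate_by_seen_targets(predictions, target_subsets, n_targets):
    """Average each target's predictions over the base models trained on it.

    predictions[m] maps a target index to the value predicted by base model m.
    target_subsets[m] lists the target indices base model m considered.
    Targets that no base model considered get None.
    """
    totals = [0.0] * n_targets
    counts = [0] * n_targets
    for preds, subset in zip(predictions, target_subsets):
        for t in subset:
            totals[t] += preds[t]
            counts[t] += 1
    return [totals[t] / counts[t] if counts[t] else None for t in range(n_targets)]


if __name__ == "__main__":
    # Three hypothetical base models, each trained on two of three targets.
    subsets = [[0, 1], [1, 2], [0, 2]]
    preds = [{0: 1.0, 1: 2.0}, {1: 4.0, 2: 6.0}, {0: 3.0, 2: 2.0}]
    print(aggregate_by_seen_targets(preds, subsets, n_targets=3))
```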

“…Moreover, it has been confirmed that the data-driven method is superior to random selection for the label space division in multi-label classification problems [100]. Especially, the community detection method has been well applied to multiple benchmark data sets for multi-label learning, it divides the label space in a data-driven manner [100]. Thus, this study discusses the application of five classic community detection algorithms in DTIs prediction.…”

Section: Algorithms Of Multi-label Learning (mentioning)

confidence: 91%

“…To consider the correlation among labels informatively, the data-driven clustering algorithm is used instead of the random partition strategy. Moreover, it has been confirmed that the data-driven method is superior to random selection for the label space division in multi-label classification problems [100]. Especially, the community detection method has been well applied to multiple benchmark data sets for multi-label learning, it divides the label space in a data-driven manner [100].…”

Section: Algorithms Of Multi-label Learning (mentioning)

confidence: 98%

Predicting drug-target interactions using multi-label learning with community detection method (DTI-MLCD)

Chu, Shan, Salahub, et al. 2020

Preprint

Identifying drug-target interactions (DTIs) is an important step for drug discovery and drug repositioning. To reduce heavily experiment cost, booming machine learning has been applied to this field and developed many computational methods, especially binary classification methods. However, there is still much room for improvement in the performance of current methods. Multi-label learning can reduce difficulties faced by binary classification learning with high predictive performance, and has not been explored extensively. The key challenge it faces is the exponential-sized output space, and considering label correlations can help it. Thus, we facilitate the multi-label classification by introducing community detection methods for DTIs prediction, named DTI-MLCD. On the other hand, we updated the gold standard data set proposed in 2008 and still in use today. The proposed DTI-MLCD is performed on the gold standard data set before and after the update, and shows the superiority than other classical machine learning methods and other benchmark proposed methods, which confirms the efficiency of it. The data and code for this study can be found at https://github.com/a96123155/DTI-MLCD.

…[8-41], drug-target pairs and interactions are treated as samples and labels, respectively. It describes the drug-target pair by encoding drugs and targets as the feature vector, then, predicts DTIs by building a binary classifier. In addition to the binary classification methods, there are network inference methods [42-55], matrix factorization methods [56-63], kernel-based methods [64-68], restricted Boltzmann machine method [69], collaborative filtering method [70], clustering method [71], label propagation method [72], etc. It is worth noting that many of these other methods can be attributed to the binary classification method in a sense.
For example, the network inference method regards the DTIs prediction problem as the bipartite network inference problem, and infers missing edges to achieve DTIs prediction. If the missing edges are regarded as negative samples and the existing edges are regarded as positive samples, it is converted into a binary classification problem. For the binary classification method, it requires the participation of positive and negative samples, so unknown DTIs are often treated as negative samples. This negative sample construction strategy will not only introduce noise but also cause data imbalance as a large number of negative samples. Besides, it is also faced with excessive computational load and overfitting due to the redundant feature space and extremely high feature dimensions. For example, 10 drugs and 10 targets will be combined into 10 × 10 = 100 samples, and the same drug or target in different samples has the same feature vector, that is, the feature vector of each drug or target will appear 10 times in the feature space of 100 samples. To reduce the above difficulties, the application of multi-label learning to DTI prediction problems is worth ...

“…Among ensemble methods, Tsoumakas et al (2011) break the initial set of labels into a number of small random subsets and employ the LP algorithm to train a corresponding classifier. Szymański et al (2016) propose to construct a label co-occurrence graph and perform community detection to partition the label set.…”

Section: Related Work (mentioning)

confidence: 99%
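
The two steps named in the statement above (build a label co-occurrence graph, then detect communities to partition the label set) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: edge weights count how often two labels appear on the same sample, and a simplified, deterministic label propagation routine stands in for a community detection algorithm. The function names are hypothetical and the routine is not claimed to be the method used in the cited paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_graph(label_rows):
    """Edge weight = number of samples in which both labels appear together."""
    weights = Counter()
    for row in label_rows:
        for a, b in combinations(sorted(row), 2):
            weights[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), w in weights.items():
        graph[a][b] = w
        graph[b][a] = w
    return graph


def label_propagation(graph, nodes, max_rounds=100):
    """Simplified community detection: each node repeatedly adopts the
    community with the largest total edge weight among its neighbours.
    Ties go to the smallest community name so the result is deterministic."""
    community = {n: n for n in nodes}
    for _ in range(max_rounds):
        changed = False
        for n in sorted(nodes):
            votes = Counter()
            for neighbour, w in graph.get(n, {}).items():
                votes[community[neighbour]] += w
            if not votes:
                continue  # isolated label keeps its own community
            best = max(votes.values())
            choice = min(c for c, v in votes.items() if v == best)
            if choice != community[n]:
                community[n] = choice
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for n, c in community.items():
        groups[c].append(n)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    rows = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"},
            {"d", "e"}, {"d", "e"}, {"c", "d"}]
    graph = cooccurrence_graph(rows)
    print(label_propagation(graph, {"a", "b", "c", "d", "e"}))
```

Each resulting group of labels would then be handled by its own classifier, in place of the randomly drawn subsets.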

Hierarchical Sequence-to-Sequence Model for Multi-Label Text Classification

Yang, Liu 2019

IEEE Access

Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations between labels. Besides, different parts of the text can contribute differently to predicting different labels, which is not considered by existing models. In this paper, we propose to view the multi-label classification task as a sequence generation problem, and apply a sequence generation model with a novel decoder structure to solve it. Extensive experimental results show that our proposed methods outperform previous work by a substantial margin. Further analysis of experimental results demonstrates that the proposed methods not only capture the correlations between labels, but also select the most informative words automatically when predicting different labels.

The datasets and code are available at https://github.com/lancopku/SGM. This work is licenced under a Creative Commons Attribution 4.0 International Licence. Licence details:
