2016
DOI: 10.3390/e18080282

How Is a Data-Driven Approach Better than Random Choice in Label Space Division for Multi-Label Classification?

Abstract: We propose using five data-driven community detection approaches from social networks to partition the label space in the task of multi-label classification as an alternative to random partitioning into equal subsets as performed by RAkELd. We evaluate modularity-maximizing using fast greedy and leading eigenvector approximations, infomap, walktrap and label propagation algorithms. For this purpose, we propose to construct a label co-occurrence graph (both weighted and unweighted versions) based on tr…

Cited by 46 publications (45 citation statements)
References 25 publications
“…In addition to this, RAkEL's computational complexity is high because the generated output spaces are label powersets and the underlying classification algorithm is a parameter, which can considerably change/worsen the training times (e.g., if we use SVMs instead of ordinary decision trees). This approach has been extended by Szymański et al (2016), where the authors propose not to use the original random partitioning of subsets as performed by RAkEL, but rather a data-driven approach. They propose to use community detection approaches from social networks to partition the label space, which can find better subspaces than random search.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, it has been confirmed that the data-driven method is superior to random selection for the label space division in multi-label classification problems [100]. Especially, the community detection method has been well applied to multiple benchmark data sets for multi-label learning, it divides the label space in a data-driven manner [100]. Thus, this study discusses the application of five classic community detection algorithms in DTIs prediction.…”
Section: Algorithms Of Multi-label Learning (mentioning)
confidence: 91%
“…To consider the correlation among labels informatively, the data-driven clustering algorithm is used instead of the random partition strategy. Moreover, it has been confirmed that the data-driven method is superior to random selection for the label space division in multi-label classification problems [100]. Especially, the community detection method has been well applied to multiple benchmark data sets for multi-label learning, it divides the label space in a data-driven manner [100].…”
Section: Algorithms Of Multi-label Learning (mentioning)
confidence: 98%
“…Among ensemble methods, Tsoumakas et al (2011) break the initial set of labels into a number of small random subsets and employ the LP algorithm to train a corresponding classifier. Szymański et al (2016) propose to construct a label co-occurrence graph and perform community detection to partition the label set.…”
Section: Related Work (mentioning)
confidence: 99%
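
The idea summarized in the statements above (build a label co-occurrence graph from the training labels, then split the label set by community detection) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the function names `cooccurrence_graph` and `label_propagation` and the toy data are hypothetical, and the community step is a simplified, deterministic variant of label propagation (ties go to the smallest community id) rather than any of the specific algorithms evaluated in the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_graph(label_sets):
    """Weighted edges: how many training samples carry both labels."""
    weights = Counter()
    for labels in label_sets:
        for a, b in combinations(sorted(set(labels)), 2):
            weights[(a, b)] += 1
    return weights


def label_propagation(weights, nodes, max_iter=100):
    """Simplified asynchronous label propagation with deterministic ties.

    Assumption: a fixed node order and smallest-id tie-breaking replace the
    random choices of the usual algorithm, so the result is reproducible.
    """
    neighbours = defaultdict(dict)
    for (a, b), w in weights.items():
        neighbours[a][b] = w
        neighbours[b][a] = w

    community = {n: n for n in nodes}  # every label starts alone
    for _ in range(max_iter):
        changed = False
        for n in sorted(nodes):
            if not neighbours[n]:
                continue  # isolated label keeps its own community
            score = Counter()
            for m, w in neighbours[n].items():
                score[community[m]] += w
            # heaviest neighbouring community wins; ties -> smallest id
            best = min(score, key=lambda c: (-score[c], c))
            if best != community[n]:
                community[n] = best
                changed = True
        if not changed:
            break

    groups = defaultdict(set)
    for n, c in community.items():
        groups[c].add(n)
    return sorted(sorted(g) for g in groups.values())


# Toy training label sets (hypothetical data, one set per sample).
label_sets = [
    {"a", "b"}, {"a", "b", "c"}, {"b", "c"},
    {"d", "e"}, {"d", "e"}, {"c", "d"},
]
graph = cooccurrence_graph(label_sets)
all_labels = sorted({label for s in label_sets for label in s})
partition = label_propagation(graph, all_labels)
print(partition)
```

Each group in `partition` would then be handed to its own label powerset classifier, in place of the random equal-size subsets used by RAkELd. An unweighted graph, which the abstract also mentions, corresponds to setting every edge weight to 1.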