2017
DOI: 10.15439/2017f84

Document Clustering using a Graph Covering with Pseudostable Sets

Abstract: In text mining, document clustering describes the efforts to assign unstructured documents to clusters, which in turn usually refer to topics. Clustering is widely used in science for data retrieval and organisation. In this paper we present a new graph-theoretical approach to document clustering and its application to a real-world data set. We will show that the well-known graph partition to stable sets or cliques can be generalized to pseudostable sets or pseudocliques. This allows to make a soft clu…
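To make the classical starting point concrete, here is a minimal sketch of a hard partition of a graph into stable sets (sets with no two adjacent vertices), using a simple greedy heuristic. It is not the paper's method: it does not implement pseudostable sets, soft clustering, or the integer linear program mentioned in the citation statements. The function name `greedy_stable_partition`, the toy graph `conflicts`, and the convention that an edge joins two documents too dissimilar to share a cluster are all assumptions made for illustration.

```python
def greedy_stable_partition(adjacency):
    """Greedily partition vertices into stable sets.

    `adjacency` maps each vertex to the set of its neighbours.
    A stable set contains no two adjacent vertices. The result is a
    valid partition, but not necessarily one with the fewest parts.
    """
    parts = []
    # Visit high-degree vertices first (ties broken by name); this is a
    # common heuristic, assumed here, and gives no optimality guarantee.
    for v in sorted(adjacency, key=lambda u: (-len(adjacency[u]), u)):
        for part in parts:
            # v may join a part only if none of its neighbours is in it.
            if adjacency[v].isdisjoint(part):
                part.add(v)
                break
        else:
            parts.append({v})
    return parts


# Hypothetical toy graph: an edge joins two documents that are assumed
# to be too dissimilar to belong to the same cluster.
conflicts = {
    "d1": {"d3", "d4"},
    "d2": {"d3", "d4"},
    "d3": {"d1", "d2"},
    "d4": {"d1", "d2"},
}

clusters = greedy_stable_partition(conflicts)
print(clusters)
```

Under this edge convention each stable set is a group of mutually compatible documents, so every document lands in exactly one cluster. The generalization described in the abstract relaxes this hard assignment, which the sketch does not attempt.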

Citations: cited by 6 publications (14 citation statements)
References: 21 publications
“…The Integer Linear Program Approach, introduced in [1] and implemented in DocClustering, can calculate optimal solutions of the multiple pseudostable sets partitioning problem (minMPS'-a-IP as given in [1]). It defines constraints to all relevant steps of the underlying graph partition and has exponential runtime.…”
Section: Integer Linear Program (mentioning)
confidence: 99%
“…It is used to assign documents to clusters, which usually represent topics. In [1] a novel approach using graph theory to document clustering is presented by Dörpinghaus et al. and discussed together with its application on a real-world data set retrieved from the PubMed database. It is shown that this method is superior to conventional algorithms and provides more accurate clustering on biological and medical data, depending on the chosen similarity measure.…”
Section: Introduction (mentioning)
confidence: 99%