2006
DOI: 10.1016/j.ipm.2006.03.017

Text mining without document context

Abstract: We consider a challenging clustering task: the clustering of multi-word terms without document co-occurrence information in order to form coherent groups of topics. For this task, we developed a methodology taking as input multi-word terms and lexico-syntactic relations between them. Our clustering algorithm, named CPCL, is implemented in the TermWatch system. We compared CPCL to other existing clustering algorithms, namely hierarchical and partitioning (k-means, k-medoids). This out-of-context clustering task l…

Cited by 54 publications (12 citation statements)
References 27 publications
“…The CPCL algorithm has been evaluated against variants of hierarchical clustering and k-means and was found to produce more homogeneous clusters. See [3] for more details.…”
Section: Term Clustering (mentioning)
confidence: 99%
“…It has been shown that this variant of hierarchical clustering preserves its main ultrametric properties [15]. The clustering algorithm is implemented using a straightforward O(E) procedure called Select Local Maximum Edge (SLME) [16].…”
Section: Classification By Preferential Clustered Link (CPCL) (mentioning)
confidence: 99%
“…Clustering usually aims at finding compact and clearly separated clusters. Clustering techniques have been applied in domains such as sensory time series (Yin and Yang, 2005) and text mining (SanJuan and Ibekwe-SanJuan, 2006). This paper presents an iterative methodology for supporting system identification using clustering.…”
Section: Introduction (mentioning)
confidence: 99%