2014
DOI: 10.5267/j.msl.2013.12.033

Document features selection using background knowledge and word clustering technique

Abstract: With the ongoing development of storage, communication, and electronic media, a significant amount of information is being collected and stored in different forms, such as electronic documents and document databases, which makes it difficult to process properly. To extract knowledge from this large volume of document data, we need methods for organizing and indexing documents. Among these methods are clustering and classification, where the objective is to organize documents a…

Cited by 2 publications (1 citation statement)
References 16 publications
“…Among these methods, although K-means is widely used, due to its hard clustering nature and sensitivity to noise, some researchers have proposed more flexible soft clustering algorithms, such algorithm is the fuzzy C-means clustering algorithm (FCM), which can accommodate uncertainties related to data points. There are also clustering algorithms such as PFCM [39] and KFCM [40]. In addition, there is the FuzBin-based binarization method, which Annabestani et al [41] use to extract text information from document images.…”
Section: Image Feature Methods
Citation type: mentioning
confidence: 99%