Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1998
DOI: 10.1145/290941.290956

Web document clustering

citations
Cited by 743 publications
(28 citation statements)
references
References 12 publications
“…They condense documents into a few words and phrases, offering a brief and precise description of a document's contents. They have many further applications, including the classification or clustering of documents (Jones & Mahoui, 2000; Zamir & Etzioni, 1998, 1999), search and browsing interfaces (Gutwin, Paynter, Witten, Nevill‐Manning, & Frank, 1999; Jones, 1999; Jones & Paynter, 1999), retrieval engines (Arampatzis, Tsoris, Koster, & Van der Weide, 1998; Croft, Turtle, & Lewis, 1991; Jones & Staveley, 1999), and thesaurus construction (Kosovac, Vanier, & Froese, 2000; Paynter, Witten, & Cunningham, 2000).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Clusters are nodes of a suffix tree formed from suffix trees of the input documents (trees containing all suffixes of a string). An original STC [6] method has a great contextual dependence and low accuracy, so it has developed its DIG [7] modification, which precision is about 70%. Its shortcoming is a high price of the tree or graph building in the case of receiving documents by network [8].…”
Section: Hierarchical Methods (Single Link Complete Link
Citation type: mentioning
confidence: 99%
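
The statement above describes clusters as nodes of a suffix tree built over the input documents. As a rough illustration only, the sketch below finds candidate "base clusters" (word phrases shared by at least two documents) by enumerating bounded-length prefixes of every word-level suffix in a plain dictionary. This is a naive stand-in for a real suffix tree, not the STC implementation from the cited paper; the function name `base_clusters`, the parameters `min_docs` and `max_len`, and the sample sentences are all assumptions made for the example.

```python
from collections import defaultdict


def base_clusters(docs, min_docs=2, max_len=4):
    """Map each word phrase to the set of document indices that contain it.

    Hypothetical helper: a dictionary of phrases replaces the suffix tree,
    so this is quadratic in document length and only suitable for tiny inputs.
    """
    phrase_docs = defaultdict(set)
    for idx, text in enumerate(docs):
        words = text.lower().split()
        for start in range(len(words)):
            # Each prefix of the suffix starting at `start` is a phrase
            # occurring in this document (a path from the root in a suffix tree).
            stop = min(start + max_len, len(words))
            for end in range(start + 1, stop + 1):
                phrase_docs[tuple(words[start:end])].add(idx)
    # Keep only phrases shared by enough documents (tree nodes with
    # descendants from at least `min_docs` documents).
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}


# Illustrative input (assumed, not taken from the citation statements).
docs = ["cat ate cheese", "mouse ate cheese too", "cat ate mouse too"]
clusters = base_clusters(docs)
for phrase, members in sorted(clusters.items()):
    print(" ".join(phrase), sorted(members))
```

A full clustering method would additionally score and merge overlapping base clusters; that step is omitted here because the statement does not describe it.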
“…A significant portion of the unstructured content collected from social media is text. Text mining techniques can be applied for automatic organization, navigation, retrieval, and summary of huge volumes of text documents [59][60][61]. This concept covers a number of topics and algorithms for text analysis including natural language processing (NLP), information retrieval, data mining, and machine learning [62].…”
Section: Text Analytics
Citation type: mentioning
confidence: 99%