2005
DOI: 10.1007/11362197_9

Web Page Classification*

Cited by 36 publications
(9 citation statements)
References 93 publications
“…Earlier work has shown some indications that other elements may also have positive or negative effects on the relevance of their contents, such as text style elements (bold, italic, etc.) or anchor text [11,30]. Additionally, we could experiment with different weights for the different sizes of headers (h1 to h6).…”
Section: Discussion (mentioning)
confidence: 99%
“…Although previous research on unsupervised keyphrase extraction from web pages is limited, it has repeatedly been shown that the HTML element in which a word or phrase appears correlates with its relevance for the topic of a page [11][12][13]. Thomaidou and Vazirgiannis proposed an unsupervised keyphrase recommendation system for web pages that used hypertext location information [14].…”
Section: Hypertext Documents (mentioning)
confidence: 99%
“…More recently the ability of genre to improve retrieval from large collections of documents including the web has made it of considerable interest for IR and automatic classification. Genre theory informs one of the two approaches to automatic webpage classification, complementing classification based on content or subject (Choi and Yao, 2005). One of genres' major potentials in webpage classification is that they are especially suitable for collections made up of different types of documents.…”
Section: Introduction (mentioning)
confidence: 99%
“…We are interested in cluster analysis that can be used to organize Web pages into clusters based on their contents or genres (Choi & Yao, 2005). Clustering is an unsupervised discovery process for partitioning a set of data into clusters such that data in the same cluster is more similar to one another than […] classification methods, cluster analysis has the advantage that it does not require any training data (i.e., the labeled data), but can achieve the same goal in that it can classify similar Web pages into groups.…”
Section: Introduction (mentioning)
confidence: 99%