2004
DOI: 10.1108/14684520410531673

Dynamic and hierarchical classification of Web pages

Abstract: Automatic classification of Web pages is an effective way to organise the vast amount of information and to assist in retrieving relevant information from the Internet. Although many automatic classification systems have been proposed, most of them ignore the conflict between the fixed number of categories and the growing number of Web pages being added into the systems. They also require searching through all existing categories to make any classification. This article proposes a dynamic and hierarchical clas…

Cited by 21 publications (11 citation statements)
References: 10 publications
“…This is done by (1) retrieve all HTML codes of the training web page, for which we use the Mathematica function: Import[url, "Source"]. (2) We analyzed the HTML codes to determine the feature counts, for which we use the following Mathematica code:…”
Section: Implementation and Test Results (mentioning)
confidence: 99%
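The Mathematica code referred to in the statement above is not reproduced here. As a rough illustration of the second step only (counting features in already-retrieved HTML), here is a minimal Python sketch. It assumes that a "feature" is simply an HTML tag name; the `TagCounter` class, the `feature_counts` function, the default tag list, and the sample string are all hypothetical and are not taken from the citing paper.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening tag seen while parsing an HTML string."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.counts[tag] += 1


def feature_counts(html_source, features=("a", "img", "p", "table")):
    """Return how often each tag in `features` occurs (assumed feature set)."""
    parser = TagCounter()
    parser.feed(html_source)
    parser.close()
    return {name: parser.counts[name] for name in features}


# Hypothetical page source standing in for a downloaded training page.
sample = (
    "<html><body><p>Hi</p><p>There <a href='#'>link</a></p>"
    "<img src='x.png'></body></html>"
)
print(feature_counts(sample))
```

The resulting dictionary of counts could then serve as a feature vector for a classifier; which features the citing authors actually counted is not stated in the excerpt.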
“…The proposed system can achieve reasonable average classification accuracy, which provides the groundwork for future research. We incorporated this system with our other genre [11] and subject based [1]-[5] system to create a comprehensive web page classification system [2], [4], [14].…”
Section: Implementation and Test Results (mentioning)
confidence: 99%
“…Because our work is focused on a concept-identification method that involved ontology we do not describe the classifier-based techniques for identification of concepts. Some example of the work related to this topic can be found in [4], [5], [6], and [7].…”
Section: Related Work (mentioning)
confidence: 99%
“…In [7], a greedy probabilistic hierarchical classifier is used to determine the most suitable path in the hierarchy from the root to a leaf. In a second step, another classifier is used to determine the best class along this path.…”
Section: Hierarchical Classification (mentioning)
confidence: 99%
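The two-step procedure described in the statement above can be sketched roughly as follows. This is a toy Python illustration under stated assumptions, not the method of the cited work: the hierarchy, the `child_prob` and `node_score` callables, and all numbers are hypothetical stand-ins for the two classifiers, whose details the excerpt does not give.

```python
def greedy_path(tree, root, child_prob):
    """Step 1: from the root, repeatedly follow the most probable child
    until a node without children (a leaf) is reached."""
    path = [root]
    node = root
    while tree.get(node):
        node = max(tree[node], key=lambda child: child_prob(node, child))
        path.append(node)
    return path


def best_on_path(path, node_score):
    """Step 2: a second scorer picks the best class among the path nodes."""
    return max(path, key=node_score)


# Hypothetical category hierarchy: parent -> list of children.
tree = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": [],
}

# Hypothetical outputs of the first (path-finding) classifier.
probs = {
    ("root", "science"): 0.7,
    ("root", "sports"): 0.3,
    ("science", "physics"): 0.55,
    ("science", "biology"): 0.45,
}

# Hypothetical outputs of the second classifier for nodes on the path.
scores = {"root": 0.1, "science": 0.6, "physics": 0.4}

path = greedy_path(tree, "root", lambda parent, child: probs[(parent, child)])
label = best_on_path(path, lambda node: scores[node])
print(path, label)
```

Because the descent is greedy, only the children of the current node are scored at each level, so the search never has to visit every category; the second step then allows the final label to be an inner node of the path rather than the leaf.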
“…Roughly speaking, the lower the quality of the estimates, the more conservative the classifier should become, for example by decreasing L. To illustrate, consider a situation, in which the flat classifier is extremely misleading: It predicts a probability close to 1 for a randomly selected class and distributes the remaining probability mass uniformly over the other classes. Computing the expected utility according to (7), the optimal prediction of the hierarchical classifier will be the class with probability close to 1. However, as this class is chosen at random by the flat classifier, the true performance of the hierarchical classifier will be very poor.…”
Section: Training the Classifier (mentioning)
confidence: 99%