2017
DOI: 10.3906/elk-1501-236

New use of the HITS algorithm for fast web page classification

Abstract: The immense number of documents published on the web requires automatic classifiers that can organize and retrieve information from these large resources. Typically, automatic web page classifiers handle millions of web pages, tens of thousands of features, and hundreds of categories. Most classifiers use the vector space model to represent the dataset of web pages. The components of each vector are computed using the term frequency-inverse document frequency (TFIDF) s…
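As a rough illustration of the vector space representation the abstract mentions, the sketch below builds TF-IDF vectors for a few toy pages. It uses the common textbook weighting tf × log(N / df); the paper's exact weighting scheme is not visible in the truncated abstract, so this variant, the function name `tfidf_vectors`, and the sample pages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using the textbook weight tf * log(N / df).

    Hypothetical helper for illustration; not the weighting used in the paper.
    """
    # Naive whitespace tokenisation; a real classifier would also strip
    # markup, stop words, and punctuation.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty pages
        vectors.append(
            [(counts[term] / total) * math.log(n_docs / df[term]) for term in vocab]
        )
    return vocab, vectors


# Toy "web pages" (made up for this example).
pages = [
    "sports news football scores",
    "football transfer news",
    "python programming tutorial",
]
vocab, vectors = tfidf_vectors(pages)
```

Each page becomes one vector with one component per vocabulary term, which is why real collections end up with tens of thousands of features. A term that occurs in every page gets weight zero under this weighting, since log(N / N) = 0.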

Cited by 4 publications (2 citation statements)
References 19 publications
“…The modified HITS (M‐HITS) algorithm is designed based on the HITS algorithm to measure the statistical analysis value of each node. The M‐HITS algorithm computes two separate values for each node, the out‐CRn oc n and the in‐CRn ic n .…”
Section: Methods (mentioning)
confidence: 99%
“…Meanwhile, TC techniques have been applied in various contexts such as web page classification, authorship attribution, knowledge management, and spam email detection. For example, Qi and Davison (2009), Kiziloluk and Ozer (2017), and Meadi et al (2017) explored algorithms and features in web page classification; Li et al (2017) and Saleh et al (2017) focused on the application of semantics-based approaches in web page classification. In the field of authorship attribution, in addition to traditional unsupervised methods such as Burrows’ delta (Burrows, 2002), an increasing number of studies have employed machine learning based classification techniques and reported promising results (Ebrahimpour et al, 2013; Jockers et al, 2008; Posadasduran et al, 2017; Tsimboukakis & Tambouratzis, 2010).…”
Section: Introduction (mentioning)
confidence: 99%
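
For context on the M-HITS statement above, here is a minimal sketch of the classic HITS iteration, which assigns every node two scores (hub and authority). It is the standard algorithm only: it is neither the M-HITS variant from the citing work nor the classification method of the cited paper, whose details are not given on this page. The function name `hits`, the iteration count, and the toy link graph are assumptions.

```python
import math


def hits(edges, iterations=50):
    """Classic HITS on a directed graph given as (source, target) pairs.

    Returns two dicts: hub scores and authority scores, each L2-normalised.
    Hypothetical helper for illustration only.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum of hub scores of the nodes linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]

        # Hub update: sum of (updated) authority scores of the nodes linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]

        # Normalise so the scores stay bounded across iterations.
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in new_auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}

    return hub, auth


# Toy link graph (made up): pages a and b link to c, and a also links to d.
edges = [("a", "c"), ("b", "c"), ("a", "d")]
hub, auth = hits(edges)
```

In this toy graph, page c receives more links than d and so ends up with the higher authority score, while page a links to both and ends up with the higher hub score.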