2021
DOI: 10.1007/s00500-021-05816-z

RETRACTED ARTICLE: Efficiently harvesting deep web interfaces based on adaptive learning using two-phase data crawler framework

Cited by 13 publications (5 citation statements)
References 13 publications
“…Among many programming languages, Python is the most widely used scripting language for writing web crawler programs. Based on Python language, there are many excellent libraries and crawler frameworks, such as scratch, beautiful soup, Crawley, Python goose, and mechanize [15,16]. Python, an assembly language, will provide a web page interface that can be used directly to facilitate the operation of crawler code.…”
Section: Methods (mentioning)
confidence: 99%
“…Each online platform is composed of different degrees of information producers, such as organized OGC (professional content production), relatively specialised PGC (professional content production), and individual UGC (user produced content), and information creators of all kinds have more diverse information sources and information presentation methods than traditional media in the past, but rigour and standardisation are seriously lacking. In an era, when the flow is king, public figures and "big V" focus on the number of clicks, retweets, likes, and favorites, all hoping to get a share of the "attention" resources [16]. Against this backdrop, there were several problems with the coverage of Chinese people during the epidemic control period: first is about the amplification of individual incidents.…”
Section: Communication Factors (mentioning)
confidence: 99%
“…In comparison to the usual design, the implemented technique crawls, allowing for more equitable data distribution through plug-and-plug. When compared with the existing methods such as IHCM [16], SVM [17], NTPDCF [18], and SIMHAR [21], the implemented hamming distance method achieves better performance values of 99.8%, 99.9%, 98%, and 99% in terms of accuracy, precision, recall, and f-measure.…”
Section: Discussion (mentioning)
confidence: 97%
“…This section provides a discussion of the implemented hamming distance and compares those results with existing methods such as IHCM [16], SVM [17], NTPDCF [18], and SIMHAR [21] in the comparative analysis section. The main goal of the hamming distance is to avoid becoming the bottleneck in the pipeline, these algorithms must be fast and efficient.…”
Section: Discussion (mentioning)
confidence: 99%