2022
DOI: 10.30630/joiv.6.4.1525

Evaluating Web Scraping Performance Using XPath, CSS Selector, Regular Expression, and HTML DOM With Multiprocessing Technical Applications

Abstract: Data collection has become a necessity today, especially since many sources of data on the internet can be used for various needs. The main activity in data collection is collecting quality information that can be analyzed and used to support decisions or provide evidence. The process of retrieving data from the internet is also known as web scraping. There are various methods of web scraping that are commonly used. The amount of data scattered on the internet will be quite time-consuming if the web scraping i…

Cited by 3 publications (2 citation statements)
References 16 publications
“…Regular expressions were used to help select elements on the pages, since studies [4,11] show that this technique has the lowest memory consumption among the most popular web scraping techniques: HTML DOM, XPath, CSS selectors, and regex.…”
Section: Methodological Approach
Citation type: unclassified
“…This approach would ensure accessibility across various devices, including mobile platforms. The suitability of a web application is further reinforced by the effectiveness of webpage data representation (DOM, HTML, and CSS) in crafting responsive and interactive surveys that dynamically adapt their content to present educational information upon completion [20]. Figure 3.…”
Section: React App Implementation
Citation type: mentioning
confidence: 99%