2015
DOI: 10.5120/19297-0734

Web Document Segmentation for Better Extraction of Information: A Review

Abstract: This paper reviews the problem of web page segmentation. According to the recent studies, there exist different approaches used to segment the web page into multiple blocks. Segmentation of web document is an essential step for many applications, such as text classifications, clustering, extraction of information and searching. The study provided full description for each approach and showed its contribution to the work area of research. Also the paper discusses the variance between these approaches, explainin…

Cited by 5 publications (1 citation statement)
References 31 publications
“…Segmentation of web pages and content extraction have been pivotal research areas for several years, given the exponential increase in information that necessitates processing. Numerous tactics have been devised to augment the efficiency and efficacy of data extraction and web segmentation tasks (Eldirdiery et al, 2015). This section elaborates on some of the most pertinent studies related to the proposed technique, emphasising block detection, content extraction, and the implementation of machine learning methodologies.…”
Section: Establishing Connections Between Content Blocks From Differe...
Citation type: mentioning
Confidence: 99%