2010
DOI: 10.1080/17445760802429585

A segmentation method for web page analysis using shrinking and dividing

Abstract: Drawing on image processing technology and the characteristics of web pages, this paper proposes a new web segmentation method, iterated shrinking and dividing. Dividing conditions and the concept of a dividing zone are introduced, on the basis of which the web page image is divided into visually consistent sub-images by iterative shrinking and splitting. First, the web page is saved as an image, which is preprocessed with an edge detection algorithm such as Canny. Then dividing zones are detected and the web image is segmen…
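The abstract is cut off before the dividing conditions are stated, so the following is only a minimal sketch of the general shrink-then-divide idea, not the authors' algorithm. It assumes that a "dividing zone" is an edge-free horizontal or vertical band at least `min_gap` pixels wide, and it uses a plain gradient threshold as a stand-in for Canny so that the example needs nothing beyond NumPy. All function and parameter names (`edge_map`, `shrink`, `widest_gap`, `segment`, `min_gap`) are hypothetical.

```python
import numpy as np

def edge_map(gray, threshold=0.1):
    """Binary edge map from gradient magnitude (assumed stand-in for Canny)."""
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy) > threshold

def shrink(edges, box):
    """Shrink box (top, bottom, left, right) to tightly enclose its edge pixels."""
    t, b, l, r = box
    sub = edges[t:b, l:r]
    rows = np.flatnonzero(sub.any(axis=1))
    cols = np.flatnonzero(sub.any(axis=0))
    if rows.size == 0:
        return None  # region contains no edges at all
    return (int(t + rows[0]), int(t + rows[-1] + 1),
            int(l + cols[0]), int(l + cols[-1] + 1))

def widest_gap(profile, min_gap):
    """Longest interior run of False values of length >= min_gap, as (start, end)."""
    best, start = None, None
    for i, has_edge in enumerate(profile):
        if not has_edge:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_gap and (best is None or i - start > best[1] - best[0]):
                best = (start, i)
            start = None
    return best

def segment(edges, box=None, min_gap=3):
    """Recursively shrink a region, then divide it at its widest edge-free band."""
    if box is None:
        box = (0, edges.shape[0], 0, edges.shape[1])
    box = shrink(edges, box)
    if box is None:
        return []
    t, b, l, r = box
    sub = edges[t:b, l:r]
    row_gap = widest_gap(sub.any(axis=1), min_gap)  # horizontal band
    col_gap = widest_gap(sub.any(axis=0), min_gap)  # vertical band
    if row_gap is None and col_gap is None:
        return [box]  # no dividing zone left: this region is a leaf block
    row_w = row_gap[1] - row_gap[0] if row_gap else 0
    col_w = col_gap[1] - col_gap[0] if col_gap else 0
    if row_w >= col_w:
        s, e = row_gap
        parts = [(t, t + s, l, r), (t + e, b, l, r)]
    else:
        s, e = col_gap
        parts = [(t, b, l, l + s), (t, b, l + e, r)]
    return [leaf for p in parts for leaf in segment(edges, p, min_gap)]
```

Calling `segment(edge_map(gray))` on a grayscale screenshot array returns a list of `(top, bottom, left, right)` boxes. Splitting at the widest band first is a design choice made for this sketch; the paper's own dividing conditions may differ.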

Cited by 46 publications (30 citation statements)
References: 9 publications
“…Our segmentation technique involves directing the Web page classifier to bounded areas of a Web Page via recursive division; a technique also utilized by Cao et al [2010] in what they described as their "iterative shrinking and dividing" strategy. These bounded areas are defined by the longest frequent patterns (LFPs) of HTML sequences within each region.…”
Section: Discussion (mentioning)
confidence: 99%
“…The results for this experiment are given in Table 2. The highlighted rows (Scenes 1, 2, 17, 18, 19 and 20) refer to scenes in which no errors were introduced. As in the first experiment, the two last columns are the most interesting ones.…”
Section: Results (mentioning)
confidence: 99%
“…An image-processing-based segmentation approach is illustrated in [19]. The segmentation process based on the text density of the contents is explained in [20].…”
Section: Web Page Segmentation (mentioning)
confidence: 99%