2002
DOI: 10.1109/mis.2002.1024745

Marie-4: a high-recall, self-improving Web crawler that finds images using captions

Cited by 24 publications (10 citation statements)
References 9 publications
“…In 2002 Neil C. Rowe, proposed a intelligent agent Web crawler [20] and caption filter, searches the Web to find image captions and the associated image objects. He mainly searches the clues words from the captions of Meta data, and other text clues except captions.…”
Section: Related Work (mentioning)
confidence: 99%
“…He mainly searches the clues words from the captions of Meta data, and other text clues except captions. It uses a broad set of criteria to yield higher recall than competing systems, which generally focus on high precision [20].…”
Section: Related Work (mentioning)
confidence: 99%
“…In many cases the media can be inferred to be decorative and can be eliminated, as for many banners and sidebars on pages as well as background sounds. Simple criteria can distinguish decorative graphics from photographs (Rowe, 2002): size (photographs are larger), frequency of the most common color (graphics have a higher frequency), number of different colors (photographs have more), extremeness of the colors (graphics are more likely to have pure colors), and average variation in color between adjacent pixels in the image (photographs have less). (Hu and Bagga, 2004) extends this to classify images in order of importance as "story", "preview", "host", "commercial", "icons and logos", "headings", and "formatting".…”
Section: Content Rating By Importance (mentioning)
confidence: 99%
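
The five cues listed in the statement above (size, frequency of the most common color, number of different colors, extremeness of colors, and variation between adjacent pixels) can be illustrated with a minimal Python sketch. This is not Rowe's implementation: the function name `image_features`, the dictionary keys, the definition of a "pure" color as one whose channels are all 0 or 255, and the restriction to horizontally adjacent pixels are all assumptions made for illustration. The sketch only computes the features; it sets no thresholds, since the statement gives none.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0-255


def image_features(pixels: List[List[Pixel]]) -> Dict[str, float]:
    """Compute five illustrative cues for a non-empty rectangular RGB grid.

    Hypothetical helper: the names and exact definitions are assumptions,
    not taken from the cited paper.
    """
    height = len(pixels)
    width = len(pixels[0])
    flat = [p for row in pixels for p in row]
    counts = Counter(flat)

    # Assumed notion of "extremeness": share of pixels whose channels
    # are all either 0 or 255.
    pure = sum(1 for p in flat if all(c in (0, 255) for c in p))

    # Mean absolute per-channel difference between horizontal neighbours
    # (a simplification; vertical neighbours are ignored).
    diffs = [
        sum(abs(a - b) for a, b in zip(row[i], row[i + 1])) / 3
        for row in pixels
        for i in range(width - 1)
    ]

    return {
        "size": float(width * height),
        "top_color_frequency": counts.most_common(1)[0][1] / len(flat),
        "distinct_colors": float(len(counts)),
        "pure_color_fraction": pure / len(flat),
        "adjacent_variation": sum(diffs) / len(diffs) if diffs else 0.0,
    }
```

A caller would pass a decoded image as a list of rows of RGB tuples and feed the resulting feature dictionary to whatever classifier or rule set it uses to separate graphics from photographs.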
“…(Rowe, 2002). However, these techniques currently are used primarily for information retrieval on the Web, rather than for Web mining.…”
Section: Future Directions (mentioning)
confidence: 99%