2010 Second International Conference on Information Technology and Computer Science 2010
DOI: 10.1109/itcs.2010.76

An Approach of Extracting Web Information Based on HTMLParser

Abstract: Many applications now need to analyze the detailed content of web pages, so extracting web information quickly and effectively has become very important. Web information is primarily expressed in HTML. HTMLParser is an open-source project hosted on SourceForge.net that can parse HTML in either a linear or a nested fashion. This paper analyzes the principle of extracting web information based on HTMLParser. In addition, it gives an approach to implementing web information extraction with the classes and methods provided by H…
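
To make the "linear" style of extraction mentioned in the abstract concrete, here is a minimal sketch. It is not the paper's method and does not use the SourceForge HTMLParser library (a Java project); it substitutes Python's standard-library `html.parser.HTMLParser`, which only happens to share the name. The class `LinkExtractor` and the function `extract_links` are hypothetical names chosen for illustration. The parser walks the page once as a flat stream of start-tag, text, and end-tag events and collects the target and text of each hyperlink.

```python
from html.parser import HTMLParser  # Python stdlib; NOT the SourceForge (Java) HTMLParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: one linear pass collecting (href, text) per <a href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs, in document order
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Start tracking only anchors that actually carry an href attribute.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Text nested in child tags (e.g. <b>) also arrives here, in order.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return all (href, link text) pairs found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

A nested approach would instead build a tree of nodes first and then query it, for example with a tag-name filter. The event-driven form above trades that convenience for a single pass with very little state.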

Cited by 3 publications (2 citation statements)
References 2 publications
“…Mineral glass is a recently developed material, made from volcanic lava, which is corrosion resistant, heat resistant, wear resistant, environmentally friendly, impermeable and very strong. Basalt fiber is made from basalt rock, a typical volcanic lava, which is also environmentally friendly and impermeable [6]. In order to improve the adhesion of the interface, basalt fiber and mineral glass powder were treated with a silane coupling agent at a concentration of 0.75%.…”
Section: Introduction (mentioning)
confidence: 99%
“…The research concentrated mainly on the areas of navigation and visualization of Linked Data formats, as well as on work aimed at extracting information from Web pages. Lin and Hu [14] present HTMLParser, a method for analyzing HTML pages and effectively extracting content in a linear or nested fashion. The parser has filters and custom tags, and offers a simple interface.…”
unclassified