2003
DOI: 10.1007/978-3-540-45275-1_35

Extracting Data behind Web Forms

Abstract: A significant and ever-increasing amount of data is accessible only by filling out HTML forms to query an underlying Web data source. While this is most welcome from a user perspective (queries are relatively easy and precise) and from a data management perspective (static pages need not be maintained and databases can be accessed directly), automated agents must face the challenge of obtaining the data behind forms. In principle an agent can obtain all the data behind a form by multiple submissions of the for…

Cited by 52 publications
(30 citation statements)
References 11 publications
“…We also see an apparent solution to the issues explained so far in the "Semantic Search" research field 4 . There is a handful of related and distinct approaches, some of which we will introduce here:…”
Section: Related Work (mentioning)
confidence: 70%
“…Thus, in order to arrive at much of the data we can process with the system we have proposed in this paper, we need to access the hidden Web, a problem on which we are currently working [LYE01,LESY02]. Once extracted, if the result is a we also plan to piece together all the components we have developed in our data-extraction work [DEG] into a comprehensive extraction tool.…”
Section: Results (mentioning)
confidence: 99%
“…In [9] the system is very efficient as it automatically generates new queries from the result of the previous queries but in this crawler the system is not properly indexed.…”
Section: A (mentioning)
confidence: 99%