A tool-supported method to extract data and schema from Web sites

Estiévenart, Fabrice; Francois, A.; Henrard, Jean; Hainaut, Jean-Luc

doi:10.1109/wse.2003.1234003

Cited by 19 publications

(8 citation statements)

References 6 publications

“…On the other hand, in a top-down approach, high-level blocks are first declared before their inner (leaf) content. In a previous work [8], we show that this approach can be optimally used in the context of Web sites re-engineering (e.g., Web data migration towards a database) or when complex data structures need to be declared. According to the exploitation of the extracted data, one approach will always be preferred to the other but we are working on the integration of both views to get a multipurpose environment.…”

Section: Discussion (mentioning)

confidence: 99%

Semi-Automated Extraction of Targeted Data from Web Pages

Estiévenart, Meurisse, Hainaut, et al. 2006

22nd International Conference on Data Engineering Workshops (ICDEW'06)

Self Cite

“…Finally, we suggested, in a previous paper [8], a complementary approach for Web data extraction and schema generation. In this work, the mapping between HTML and XML was realized by means of a META file, i.e., an XML representation of page clusters based on the source HTML structure.…”

Section: Related Work (mentioning)

confidence: 99%

Semi-Automated Extraction of Targeted Data from Web Pages

Estiévenart, Meurisse, Hainaut, et al. 2006

22nd International Conference on Data Engineering Workshops (ICDEW'06)

Self Cite

“…Lixto [28] is a wrapper generation tool that is well suitable for building HTML/XML wrappers. Moreira et al [29] propose an approach to integrating WWW information, which is based on the development of a canonical domain model in XML and the wrapping of existing WWW applications with wrappers capable of communicating about entities in this common model with the applications and with an intermediary mediator.…”

Section: Resource Oriented Software Evolution (mentioning)

confidence: 99%

Linking Functions and Quality Attributes for Software Evolution

Yang, Zheng, Chu, et al. 2012

2012 19th Asia-Pacific Software Engineering Conference

Software quality properties, normally derived from non-functional requirements, are becoming more important for software. A main reason for software evolution is the unsatisfaction to software quality properties. When improving these properties through software evolution, it is essential to know whether software functions are affected and by how much. This paper proposes an approach to linking the functions with the quality properties of software for evolution via software architecture styles, aiming at contributing to (1) predicting evolution efforts and (2) transforming software for improving its quality.

“…In practice, most conceptual schemes of information systems and databases are developed essentially from zero. However, over the last decade, several approaches have emerged, with the objective of maintenance Web oriented applications based on the reverse engineering process [1]; [2]; [3]; [4]; [5]; [6]; [7].…”

Section: Introduction (mentioning)

confidence: 99%

Semantic Indexing of Web Documents Based on Domain Ontology

Dennai, Benslimane 2015

IJITCS

The first phase of reverse engineering of web-oriented applications is the extraction of concepts hidden in HTML pages including tables, lists and forms, or marked in XML documents. In this paper, we present an approach to index semantically these two sources of information (HTML page and XML document) using on the one hand, domain ontology to validate the extracted concepts and on the other hand the similarity measurement between ontology concepts with the aim of enrichment the index. This approach will be conceived in three steps (modeling, attaching and Enrichment) and thereafter, it will be realized and implemented by examples. The obtained results lead to better re-engineering of web applications and subsequently a distinguished improvement in the web structuring.
