Héléna Galhardas scite author profile

Héléna Galhardas

5 Publications

61 Citation Statements Received

57 Citation Statements Given

Affiliations

University of Lisbon, Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento, French Institute for Research in Computer Science and Automation

Publications

Graph integration of structured, semistructured and unstructured data for data journalism

Anadiotis

Balalau

Conceição

et al. 2022

Information Systems

Nowadays, journalism is facilitated by the existence of large amounts of digital data sources, including many Open Data ones. Such data sources are extremely heterogeneous, ranging from highly structured (relational databases) and semi-structured (JSON, XML, HTML) to graphs (e.g., RDF) and text. Journalists (and other classes of users lacking advanced IT expertise, such as most non-governmental organizations, or small public administrations) need to be able to make sense of such heterogeneous corpora, even if they lack the ability to define and deploy custom extract-transform-load workflows. These are difficult to set up not only for arbitrary heterogeneous inputs, but also given that users may want to add (or remove) datasets to (from) the corpus. We describe a complete approach for integrating dynamic sets of heterogeneous data sources along the lines described above: the challenges we faced to make such graphs useful, allow their integration to scale, and the solutions we proposed for these problems. Our approach is implemented within the ConnectionLens system; we validate it through a set of experiments.

Efficient development of data migration transformations

Carreira

Galhardas

2004

Data Mapper: An Operator for Expressing One-to-Many Data Transformations

Carreira

Galhardas

Pereira

et al. 2005

Transforming data is a fundamental operation in application scenarios involving data integration, legacy data migration, data cleaning, and extract-transform-load processes. Data transformations are often implemented as relational queries that aim at leveraging the optimization capabilities of most RDBMSs. However, relational query languages like SQL are not expressive enough to specify an important class of data transformations that produce several output tuples for a single input tuple. This class of data transformations is required for solving the data heterogeneities that occur when source data represents an aggregation of target data. In this paper, we propose and formally define the data mapper operator as an extension of the relational algebra to address one-to-many data transformations. We supply an algebraic rewriting technique that enables the optimization of data transformation expressions that combine filters expressed as standard relational operators with mappers. Furthermore, we identify the two main factors that influence the expected optimization gains.
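As a rough illustration of the one-to-many behaviour the abstract describes, the sketch below applies a mapper function to each input tuple and collects every output tuple it yields. It is not the paper's formal operator: the names `apply_mapper` and `split_loan`, and the loan/installment data, are hypothetical and chosen only to show a source row that aggregates several target rows.

```python
from typing import Callable, Iterable, Iterator, Tuple

Row = Tuple  # a relational tuple, modelled here as a plain Python tuple


def apply_mapper(rows: Iterable[Row],
                 mapper: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply a one-to-many mapper: each input row may yield several rows."""
    for row in rows:
        yield from mapper(row)


def split_loan(row: Row) -> Iterator[Row]:
    """Hypothetical mapper: (loan_id, total, n) -> one row per installment."""
    loan_id, total, n = row
    amount = total / n
    for k in range(1, n + 1):
        yield (loan_id, k, amount)


# Hypothetical source data: each row aggregates several target rows.
loans = [("L1", 300.0, 3), ("L2", 100.0, 2)]
installments = list(apply_mapper(loans, split_loan))
```

Here two source tuples produce five target tuples, which is the kind of result the abstract says plain relational queries cannot express directly.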

On-demand big data integration

Kathiravelu

Sharma

Galhardas

et al. 2018

Distrib Parallel Databases

Execution of data mappers

Carreira

Galhardas

2004

Data mappers are essential operators for implementing data transformations supporting schema mapping and integration scenarios such as legacy data migration, ETL processes for data warehousing, data cleaning activities, and business integration initiatives. Despite their widespread use, no formalization of this important operation has been proposed so far. In this paper we propose the data mapper operator as an extension to the relational algebra. We supply a set of algebraic rewriting rules for optimizing queries that combine standard relational operators with data mappers. Finally, we propose algorithms for their efficient physical execution.
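The abstract does not list its rewriting rules, so the following is only a sketch of one plausible kind of rule, under a stated assumption: when a filter tests an attribute that the mapper copies unchanged from its input, the filter can run before the mapper instead of after it. The function names and data are hypothetical.

```python
from typing import Iterator, List, Tuple

Row = Tuple


def mapper(row: Row) -> Iterator[Row]:
    """Hypothetical one-to-many mapper; loan_id is copied through unchanged."""
    loan_id, total, n = row
    for k in range(1, n + 1):
        yield (loan_id, k, total / n)


def filter_after_mapper(rows: List[Row], wanted: str) -> List[Row]:
    # Naive plan: expand every input row, then discard unwanted output rows.
    return [out for row in rows for out in mapper(row) if out[0] == wanted]


def filter_before_mapper(rows: List[Row], wanted: str) -> List[Row]:
    # Rewritten plan: filter the input first, so the mapper sees fewer rows.
    return [out for row in rows if row[0] == wanted for out in mapper(row)]


loans = [("L1", 300.0, 3), ("L2", 100.0, 2), ("L3", 80.0, 4)]
naive = filter_after_mapper(loans, "L1")
rewritten = filter_before_mapper(loans, "L1")
```

Both plans return the same rows, but the rewritten one invokes the mapper on one input tuple instead of three. The equivalence holds here only because the filtered attribute passes through the mapper unmodified.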
