2021
DOI: 10.1007/s10619-021-07322-5

Parallel query processing in a polystore

Abstract: The blooming of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store's native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e. joins, on top of pa…

Cited by 8 publications (6 citation statements)
References 24 publications
“…Polystores also provide seamless access to cloud data stores. The CloudMdsQL Polystore provides a functional SQL-like query language to access many data sources (relational, NoSQL, and HDFS) [70].…”
Section: Challenge Of the Future Management In Astronomical Data Arch...
Citation type: mentioning
confidence: 99%
“…Some of the key features of Apache Drill are (i) Low latency queries, which means that a simple query returns the result in a few milliseconds; (ii) Ability to access multiple data sources in a single query, such as Hive tables, JSON files and file systems (local or distributed); and (iii) it works with Business Intelligence tools, which allows for direct integration with specialized visualization tools [Drill 2022]. Apache Drill is considered a Polystore system [Kranas et al 2021] since it can query multiple DBMSs that follow multiple data models.…”
Section: Apache
Citation type: mentioning
confidence: 99%
“…Although the aforementioned data management solutions (i.e., PostGIS and Nano Cubes) have been successfully used in many Vis projects, the Database community has been developing a series of domain/problem agnostic solutions for data management in recent years that can be applied in the Vis context, especially in interactive Vis. Such solutions range from NoSQL DBMSs (e.g., MonetDB [Boncz et al 2006], MongoDB [Makris et al 2021]) to Polystore systems [Kranas et al 2021] and large-scale data analysis frameworks such as Apache Spark (with Spark SQL) [Zaharia et al 2016], in addition to well-known RDBMSs like PostgreSQL. Although the aforementioned solutions are not designed to support interactive Vis systems, they have been shown to be increasingly efficient and can be used in many contexts, replacing domain-specific solutions such as Nano Cubes.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…[5] Thus, in recent years, the database community has developed different solutions for large-scale data management, such as NoSQL DBMS, for example, MonetDB [6] and MongoDB, [7] and Polystore systems. [8] Nevertheless, the most successful data analytics solutions have been the big data frameworks, notably Apache Hadoop and its alternatives like Apache Spark. [9] Spark improves the performance of applications by automatically exploiting parallelism and accomplishing in-memory data movement, in contrast to Hadoop, whose intermediate operations write to the storage volume.…”
Citation type: mentioning
confidence: 99%