2016
DOI: 10.1109/tcc.2015.2440254

OverFlow: Multi-Site Aware Big Data Management for Scientific Workflows on Clouds

Abstract: The global deployment of cloud datacenters is enabling large scale scientific workflows to improve performance and deliver fast responses. This unprecedented geographical distribution of the computation is doubled by an increase in the scale of the data handled by such applications, bringing new challenges related to the efficient data management across sites. High throughput, low latencies or cost-related trade-offs are just a few concerns for both cloud providers and users when it comes…

Cited by 33 publications (7 citation statements)
References 23 publications
“…Various methods of technical and organizational nature were developed to solve this problem: load redistribution by the communication channels [8,9], compression of transmitted data [10], statistical processing of measurements [11] and others [12]. However, there is a not solved issue of ensuring the required accuracy of measurements when achieving the goal of reducing the total computational burden.…”
Section: Literature Review and Problem Statement (mentioning)
confidence: 99%
“…Frequently, a central storage site keeps long-term use data for pipelines. The common data access patterns of these pipelines include data dissemination, collection, and aggregation [37]. In addition, concurrent data write operations across different phases are very rare.…”
Section: Design (mentioning)
confidence: 99%
“…Accordingly, accelerating data analysis for each stage may require computing facilities that are located in different clouds. Between different stages of the geographical data pipeline, moving a large amount of data across clouds is common [5,37]. This type of multi-cloud environment can consist of resources from multiple public cloud vendors, such as Amazon AWS [1] and Microsoft Azure [29], and private data centers.…”
Section: Introduction (mentioning)
confidence: 99%
“…This brings us to the basic concept of Cloudets (Baciu et al, 2012). The ability to monitor and automatically redeploy resources (Baciu et al, 2015), within a cloud orchestration framework, aided by machine learning for predicting resource utilization and load rebalancing (Tudoran et al, 2016) makes the Cloudet architecture ideal for exploring the cloud vs brain metaphor in the context of the global Internet. (This section is contributed by Prof. George Baciu.…”
Section: Cognitive Convergence Of Intelligent Cloud Computing (mentioning)
confidence: 99%