2019
DOI: 10.1016/j.jnca.2019.05.012
SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds

Abstract: The advent of big data analytics and cloud computing technologies has resulted in wide-spread research on the data placement problem. Since data-intensive services require access to multiple datasets within each transaction, traditional schemes of uniformly partitioning the data into distributed nodes, as employed by many popular data stores like HDFS or Cassandra, may cause network congestion thereby affecting system throughput. In this article, we propose a scalable and unified framework for data-intensive s…
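The abstract's core observation is that transactions touch multiple datasets at once, which is naturally modeled as a hypergraph: datasets are vertices and each transaction's accessed-dataset set is a hyperedge. A placement that cuts few hyperedges keeps each transaction's data co-located. The sketch below is an illustration of this modeling step, not code from the paper; the function name and input shape are assumptions.

```python
from collections import defaultdict

def build_access_hypergraph(transactions):
    """Build a co-access hypergraph from a transaction log.

    `transactions` is an iterable of iterables of dataset ids.
    Each dataset becomes a vertex; each multi-dataset transaction
    becomes a hyperedge (single-dataset accesses cut nothing, so
    they are skipped). Returns (sorted vertices, list of hyperedges).
    """
    hyperedges = [frozenset(t) for t in transactions if len(set(t)) > 1]
    vertices = sorted(set().union(*hyperedges)) if hyperedges else []
    return vertices, hyperedges
```

Partitioning this hypergraph so that few hyperedges span multiple datacenters is what keeps transactions local and avoids the cross-node traffic the abstract attributes to uniform partitioning.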

Cited by 23 publications (13 citation statements)
References 29 publications
“…More specifically, any change (small or large) in the system workload would require re-execution of the full pipeline to obtain the placement output. This design decision is in line with almost every existent technique [3]- [5], [16], [17], [43], [49] in the extensive literature on data placement. Thus, making the CDR placement algorithm dynamically adapt to the changes in the system workload is not in the scope of the current work.…”
Section: Combined Data and Replica Placement (supporting)
confidence: 68%
“…On the other hand, publicly available specialized heuristics for hypergraph partitioning [7] enable graceful scaling of the aforementioned methods to large datasets. Moving further, Atrey et al [3], [5] proposed an algorithm based on spectral clustering of hypergraphs, which portrayed quality similar to the algorithms proposed in [43], however, achieved superior efficiency and scalability owing to the use of randomized eigendecomposition techniques for factorizing the hypergraph laplacian.…”
Section: Related Work (mentioning)
confidence: 98%
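The statement above describes spectral clustering of a hypergraph: embed vertices using the smallest eigenvectors of the hypergraph Laplacian, then cluster the embedding. A minimal sketch of that idea follows, assuming a 0/1 incidence matrix and unit hyperedge weights; it uses the normalized hypergraph Laplacian in the style of Zhou et al. Note the cited work's efficiency comes from *randomized* eigendecomposition, which this sketch replaces with exact `numpy.linalg.eigh` for clarity.

```python
import numpy as np

def hypergraph_spectral_partition(H, k):
    """Spectrally partition hypergraph vertices into k parts.

    H is an (n_vertices x n_edges) 0/1 incidence matrix. Builds the
    normalized hypergraph Laplacian with unit hyperedge weights:
        L = I - Dv^{-1/2} H De^{-1} H^T Dv^{-1/2}
    then embeds vertices with the k smallest eigenvectors and runs a
    tiny deterministic k-means (farthest-point init) on the embedding.
    """
    H = np.asarray(H, dtype=float)
    dv = H.sum(axis=1)                      # vertex degrees
    de = H.sum(axis=0)                      # hyperedge sizes
    Dv = np.diag(1.0 / np.sqrt(dv))
    De = np.diag(1.0 / de)
    L = np.eye(H.shape[0]) - Dv @ H @ De @ H.T @ Dv
    _, vecs = np.linalg.eigh(L)             # eigh sorts eigenvalues ascending
    X = vecs[:, :k]                         # spectral embedding
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(100):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        new = np.array([X[labels == j].mean(0) if (labels == j).any()
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels
```

In a data-placement setting, each resulting part would map to one datacenter, so datasets that transactions co-access land together.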