Jiachen Xue scite author profile

Jiachen Xue

4 Publications

45 Citation Statements Received

89 Citation Statements Given

Affiliations

Nvidia (United States), Purdue University West Lafayette, Huawei Technologies (United States)

Publications

Dynamic server provisioning to minimize cost in an IaaS cloud

Hong, Xue, Thottethodi. 2011

Cloud computing holds the exciting potential of elastically scaling computation to match time-varying demand, thus eliminating the need to provision for peak demand. However, the uncertainty of variable loads necessitates the use of "margins" (servers that must be held active to absorb unpredictable potential load surges), which can be a significant fraction of overall cost. Further, naively switching to an on-demand cloud model can actually degrade "true costs" (server costs that would be incurred even if margin costs disappeared) because of the fundamental economic rule wherein on-demand services/goods cost more compared to "reserved" goods/services where the user bears some commitment (i.e., on-demand customers must pay a premium in exchange for not undertaking the fixed-cost risk that committed customers undertake). This paper addresses the twin challenges of minimizing margin costs and true costs in an Infrastructure-as-a-Service (IaaS) cloud. Our paper makes the following two contributions. To address the problem of margin costs, we make two key observations based on real Web server traces. First, rather than use a fixed margin, we observe that the margin may be load-dependent. For example, the margin required at low loads may be higher than the margin required at high loads. Second, we observe that the "tolerance" (the fraction of time when the response time target may be violated) need not be uniform across all load levels. For example, compared to a case where we satisfy requests within the target response time 95% of the time (for a tolerance of 5%) irrespective of load, one may achieve lower costs by satisfying the response time target 93% of the time at low loads and 97% of the time at high loads, while still achieving an overall 95% satisfaction ratio.
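The 93%/97% example above can be checked with a small time-weighted average. This is only an illustrative sketch, not the paper's method: the function name `overall_satisfaction` is hypothetical, and the equal split of time between low and high load is an assumption (the abstract does not state the split, but an even one is what makes 93% and 97% average to 95%).

```python
def overall_satisfaction(levels):
    """Time-weighted satisfaction ratio across load levels.

    `levels` is a list of (time_fraction, satisfaction_ratio) pairs.
    The time fractions must sum to 1.
    """
    total_time = sum(fraction for fraction, _ in levels)
    if abs(total_time - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(fraction * ratio for fraction, ratio in levels)


# Assumption: the system spends half of its time at low load and half at high load.
non_uniform = overall_satisfaction([(0.5, 0.93), (0.5, 0.97)])
uniform = overall_satisfaction([(0.5, 0.95), (0.5, 0.95)])
print(round(non_uniform, 4), round(uniform, 4))
```

Under that assumed split, both configurations give the same overall ratio, so the non-uniform one is free to place the stricter target wherever it is cheaper to meet.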
We propose ShrinkWrap-opt, a dynamic programming algorithm that exploits both of the above observations to achieve optimal margin cost while achieving the desired (statistical) response time guarantees. To address true costs, we propose commitment straddling (the mixed use of reserved and on-demand machines) to achieve optimal true cost. Simulations with real Web server load traces (including 3 months of traces from Wikimedia from Summer 2010) using the Amazon EC2 cost model reveal that our techniques save between 13% and 29% (21% on average) in cost while satisfying response-time targets.
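The idea of mixing reserved and on-demand machines can be illustrated with a toy cost model. This is a sketch under stated assumptions, not the paper's algorithm or the Amazon EC2 cost model: the flat per-period rates, the demand trace, and the names `straddle_cost` and `best_reserved_count` are all hypothetical, and the search is a simple brute force.

```python
def straddle_cost(demand, reserved, reserved_rate, on_demand_rate):
    """Cost of serving `demand` (machines needed per period) when `reserved`
    machines are paid for in every period and any excess is bought on demand."""
    committed = reserved * reserved_rate * len(demand)
    burst = sum(max(0, d - reserved) for d in demand) * on_demand_rate
    return committed + burst


def best_reserved_count(demand, reserved_rate, on_demand_rate):
    """Brute-force search for the reserved count with the lowest total cost."""
    if not demand:
        raise ValueError("demand trace must not be empty")
    return min(
        range(max(demand) + 1),
        key=lambda r: straddle_cost(demand, r, reserved_rate, on_demand_rate),
    )


# Hypothetical trace: steady load of 2 machines with one surge to 10.
demand = [2, 2, 2, 10]
r = best_reserved_count(demand, reserved_rate=1, on_demand_rate=3)
print(r, straddle_cost(demand, r, 1, 3))    # mixed
print(straddle_cost(demand, 0, 1, 3))       # pure on-demand
print(straddle_cost(demand, 10, 1, 3))      # pure reserved
```

With these made-up numbers, reserving only the steady baseline and bursting on demand costs less than either pure strategy, which is the intuition behind mixing the two commitment levels.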

PreTrans: Reducing TLB CAM-search via page number prediction and speculative pre-translation

Xue, Thottethodi. 2013

Selective commitment and selective margin: Techniques to minimize cost in an IaaS cloud

Hong, Xue, Thottethodi. 2012

Dart: Divide and Specialize for Fast Response to Congestion in RDMA-Based Datacenter Networks

Xue, Chaudhry, Vamanan, et al. 2020. IEEE/ACM Trans. Networking

Though Remote Direct Memory Access (RDMA) promises to reduce datacenter network latencies significantly compared to TCP (e.g., 10x), end-to-end congestion control in the presence of incasts is a challenge. Targeting the full generality of the congestion problem, previous schemes rely on slow, iterative convergence to the appropriate sending rates (e.g., TIMELY takes 50 RTTs). Several papers have shown that even in oversubscribed datacenter networks most congestion occurs at the receiver. Accordingly, we propose a divide-and-specialize approach, called Dart, which isolates the common case of receiver congestion and further sub-divides the remaining in-network congestion into the simpler spatially-localized and the harder spatially-dispersed cases. For receiver congestion, we propose direct apportioning of sending rates (DASR) in which a receiver for n senders directs each sender to cut its rate by a factor of n, converging in only one RTT. For the spatially-localized case, Dart provides fast (under one RTT) response by adding novel switch hardware for in-order flow deflection (IOFD) because RDMA disallows packet reordering on which previous load balancing schemes rely. For the uncommon spatially-dispersed case, Dart falls back to DCQCN. Small-scale testbed measurements and at-scale simulations, respectively, show that Dart achieves 60% (2.5x) and 79% (4.8x) lower 99th-percentile latency, and similar and 58% higher throughput than InfiniBand, and TIMELY and DCQCN.
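The rate cut described for DASR can be sketched in a few lines. This is a minimal illustration of the "cut by a factor of n" rule only. The function name `dasr_rates`, the dictionary representation, and the example rates are assumptions. The abstract does not say how the receiver counts senders or signals them.

```python
def dasr_rates(current_rates):
    """Return the rate each sender is directed to use, given the current
    sending rate of each of the n senders targeting one receiver."""
    n = len(current_rates)
    if n == 0:
        return {}
    # Each sender cuts its rate by a factor of n.
    return {sender: rate / n for sender, rate in current_rates.items()}


# Hypothetical incast: four senders each sending at a 100 Gb/s line rate.
rates = {"s1": 100.0, "s2": 100.0, "s3": 100.0, "s4": 100.0}
new_rates = dasr_rates(rates)
print(new_rates, sum(new_rates.values()))
```

In this example the senders' combined rate after the cut equals the rate a single sender had before it, so the receiver's link is no longer oversubscribed after one round of feedback.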
