In modern Intel processors, the Last Level Cache (LLC) is divided into multiple slices, and an undocumented hashing algorithm (aka Complex Addressing) maps different parts of the memory address space to these slices to increase the effective memory bandwidth. After a careful study of Intel's Complex Addressing, we introduce a slice-aware memory management scheme, wherein frequently used data can be accessed faster via the LLC. Using our proposed scheme, we show that a key-value store can potentially improve its average performance by ∼12.2% and ∼11.4% for 100% and 95% GET workloads, respectively. Furthermore, we propose CacheDirector, a network I/O solution which extends Direct Data I/O (DDIO) and places the packet's header in the slice of the LLC that is closest to the relevant processing core. We implemented CacheDirector as an extension to DPDK and evaluated our proposed solution for latency-critical applications in Network Function Virtualization (NFV) systems. Evaluation results show that CacheDirector makes packet processing faster by reducing tail latencies (90th-99th percentiles) by up to 119 µs (∼21.5%) for optimized NFV service chains running at 100 Gbps. Finally, we analyze the effectiveness of slice-aware memory management to realize cache isolation.
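As a rough illustration of how slice-aware placement can work, the C sketch below models slice selection as XOR-reductions over physical-address bits. The hash masks, slice count, and helper names are placeholders and assumptions on my part, since the real Complex Addressing function is undocumented and CPU-specific; CacheDirector's actual implementation lives inside DPDK and is not reproduced here.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Illustrative model only (not Intel's undocumented hash): each slice-index
 * bit is assumed to be the XOR of a subset of physical-address bits. The
 * masks below are placeholders; real masks differ per CPU model and must be
 * reverse engineered (e.g., via timing or performance counters).
 */
#define NUM_SLICE_BITS 3  /* assume 8 LLC slices for this sketch */

static const uint64_t slice_hash_masks[NUM_SLICE_BITS] = {
    0x1b5f575440ULL,  /* placeholder mask for slice bit 0 */
    0x2eb5faa880ULL,  /* placeholder mask for slice bit 1 */
    0x3cccc93100ULL,  /* placeholder mask for slice bit 2 */
};

/* Map a physical address to an LLC slice under the model above. */
static unsigned phys_addr_to_slice(uint64_t paddr)
{
    unsigned slice = 0;
    for (int b = 0; b < NUM_SLICE_BITS; b++)
        slice |= (unsigned)__builtin_parityll(paddr & slice_hash_masks[b]) << b;
    return slice;
}

/*
 * Slice-aware placement: scan cache-line-aligned offsets inside a buffer
 * until one maps to the slice "closest" to the processing core (the
 * core-to-slice distances would be measured offline).
 */
static int find_offset_for_slice(uint64_t buf_paddr, size_t buf_len,
                                 unsigned target_slice)
{
    for (size_t off = 0; off < buf_len; off += 64)  /* 64-byte cache lines */
        if (phys_addr_to_slice(buf_paddr + off) == target_slice)
            return (int)off;
    return -1;  /* no suitable cache line found in this buffer */
}

int main(void)
{
    uint64_t paddr = 0x123456780ULL;  /* example physical address */
    printf("offset mapping to slice 2: %d\n",
           find_offset_for_slice(paddr, 4096, 2));
    return 0;
}
```

Under this model, placing a packet's header at an offset whose cache line hashes to the core's closest slice is what shaves the extra on-die interconnect hops off each header access.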
We present PacketMill, a system for optimizing software packet processing, which (i) introduces a new model to efficiently manage packet metadata and (ii) employs code-optimization techniques to better utilize commodity hardware. PacketMill grinds the whole packet-processing stack, from the high-level network function configuration file to the low-level userspace network (specifically DPDK) drivers, to mitigate inefficiencies and produce a customized binary for a given network function. Our evaluation results show that PacketMill increases throughput (up to 36.4 Gbps, i.e., 70%) and reduces latency (up to 101 µs, i.e., 28%), enabling nontrivial packet processing (e.g., a router) at ≈100 Gbps, when new packets arrive more than 10× faster than main memory access times, while using only one processing core.
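To make the metadata-management idea more concrete, here is a hedged C sketch; the struct and function names are illustrative and not PacketMill's actual API. The general idea it shows is co-locating a per-network-function, trimmed-down metadata struct with the packet buffer and having the receive path fill it directly, rather than copying fields between a generic descriptor and the application's own structures.

```c
#include <stdint.h>
#include <stdio.h>

/* Only the fields this particular network function actually uses
 * (hypothetical layout for illustration). */
struct nf_metadata {
    uint16_t data_len;    /* frame length, written on receive        */
    uint16_t l3_offset;   /* filled later during parsing             */
    uint32_t flow_hash;   /* e.g., an RSS-style hash from the NIC    */
};

/* Metadata lives with the packet buffer, so the driver shim and all
 * processing elements share a single definition and a single copy. */
struct nf_packet {
    struct nf_metadata meta;
    uint8_t data[2048];   /* frame bytes follow immediately          */
};

/* A thin receive shim fills the customized metadata in place, avoiding
 * the indirections of a generic per-packet descriptor. */
static void rx_fill(struct nf_packet *p, uint16_t len, uint32_t hash)
{
    p->meta.data_len  = len;
    p->meta.flow_hash = hash;
}

int main(void)
{
    struct nf_packet pkt = {0};
    rx_fill(&pkt, 64, 0xdeadbeef);  /* pretend a 64-byte frame arrived */
    printf("len=%u hash=%08x\n", pkt.meta.data_len, pkt.meta.flow_hash);
    return 0;
}
```

The customized binary mentioned in the abstract comes from compiling this kind of per-function definition together with the configuration and drivers, so the compiler can specialize and inline across what would otherwise be rigid layer boundaries.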
This is the accepted version of a paper published in IEEE Communications Surveys and Tutorials. This paper has been peer-reviewed but does not include the final publisher proof corrections or journal pagination.
Abstract: The rapid growth of data and the high dependency of industries on data have put significant focus on computing facilities. Increasing the efficiency and re-architecting the underlying infrastructure of datacenters has become a major priority. The total cost of owning and running a datacenter (DC) is affected by many parameters which, until recently, were ignored because their impact on the business economy was negligible. That is no longer the case: in the new era of the digital economy every penny counts, and the market is too aggressive to ignore anything. Hence, economic efficiency becomes vital for cloud infrastructure providers regardless of their size. This article presents a framework to assess cloud infrastructure economic efficiency, taking into account three main aspects: application profiling, hardware dimensioning, and total cost of ownership (TCO). Moreover, it presents a cost study of deploying the emerging concept of disaggregated hardware architecture in DCs based on the proposed framework. The study considers all the major cost categories incurred during the DC lifetime, in terms of both capital and operational expenditures, and provides a thorough cost comparison between a DC running on a disaggregated hardware architecture and one running on a traditional server-based hardware architecture. It demonstrates the evolution of the yearly cost over the DC lifetime and includes a sensitivity analysis that shows how to minimize the cost of running a cloud. Results show that lifecycle management cost is one of the main differentiators between the two technologies. Moreover, in the presence of heterogeneous workloads, a DC based on fully disaggregated hardware brings high savings (more than 40%, depending on the applications) compared to traditional hardware architectures, independent of the hardware set-up.
Index Terms: datacenter cost, disaggregated hardware, total cost of ownership, hardware pools, reconfigurable hardware
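For illustration only, the following C sketch shows the kind of lifetime accounting such a TCO comparison rests on. The cost categories and numbers are placeholders I have assumed, not values from the study: TCO is taken here as the upfront capital expenditure plus the operational expenditure (energy, maintenance, lifecycle management) accumulated over the DC lifetime.

```c
#include <stdio.h>

#define LIFETIME_YEARS 10  /* assumed DC lifetime for this sketch */

int main(void)
{
    /* Placeholder figures, in dollars, purely to show the accounting. */
    double capex = 12.0e6;  /* servers or hardware pools, facility, network */
    double opex_per_year[LIFETIME_YEARS] = {
        /* energy + maintenance + lifecycle management per year */
        1.5e6, 1.5e6, 1.6e6, 1.6e6, 1.7e6,
        1.7e6, 1.8e6, 1.8e6, 1.9e6, 1.9e6,
    };

    /* TCO = CAPEX + sum of yearly OPEX over the lifetime. */
    double tco = capex;
    for (int y = 0; y < LIFETIME_YEARS; y++)
        tco += opex_per_year[y];

    printf("TCO over %d years: $%.1fM\n", LIFETIME_YEARS, tco / 1e6);
    return 0;
}
```

A disaggregated architecture mainly changes the per-year terms (e.g., lower lifecycle management cost because individual resources can be replaced or upgraded independently), which is why the yearly evolution and the sensitivity analysis matter more than any single headline number.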