Modern data centers have grown beyond CPU nodes to provide domain-specific accelerators such as GPUs and FPGAs to their customers. From a security standpoint, cloud customers want to protect their data, and they are willing to pay additional costs for trusted execution environments (TEEs) such as enclaves provided by Intel SGX and AMD SEV. Unfortunately, customers have to make a critical choice: either use domain-specific accelerators for speed or use CPU-based confidential computing solutions. To bridge this gap, we aim to enable data-center-scale confidential computing that expands across CPUs and accelerators. We argue that wide-scale TEE support for accelerators would be a technically easier solution, but it is far from being a reality. Instead, our hybrid design provides enclaved execution guarantees for computation distributed over multiple CPU nodes and devices with or without TEE support. Our solution scales gracefully in two dimensions: it can handle a large number of heterogeneous nodes, and it can accommodate TEE-enabled devices as and when they become available. We observe marginal overheads of 0.42% to 8% on real-world AI data center workloads, and these overheads are independent of the number of nodes in the data center. We add custom TEE support to two accelerators (AI and storage) and integrate it into our solution, thus demonstrating that it can cater to future TEE devices.