Hsien-Hsin S. Lee scite author profile

The widespread application of deep learning has changed the landscape of computation in the data center. In particular, personalized recommendation for content ranking is now largely accomplished leveraging deep neural networks. However, despite the importance of these models and the amount of compute cycles they consume, relatively little research attention has been devoted to systems for recommendation. To facilitate research and to advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: inference latency varies by 60% across three Intel server generations, batching and co-location of inferences can drastically improve latency-bounded throughput, and the diverse composition of recommendation models leads to different optimization strategies.

Preprint. Under submission.

SAFER: Stuck-At-Fault Error Recovery for Memories

Seong et al. 2010

Adaptive transaction scheduling for transactional memory systems

2008

Transactional memory systems are expected to enable parallel programming at lower programming complexity, while delivering improved performance over traditional lock-based systems. Nonetheless, there are certain situations where transactional memory systems could actually perform worse. Transactional memory systems can outperform locks only when the executing workloads contain sufficient parallelism. When the workload lacks inherent parallelism, launching excessive transactions can adversely degrade performance. These situations are likely to become dominant in future workloads when large-scale transactions are frequently executed. In this paper, we propose a new paradigm called adaptive transaction scheduling to address this issue. Based on the parallelism feedback from applications, our adaptive transaction scheduler dynamically dispatches and controls the number of concurrently executing transactions. In our case study, we show that our low-cost mechanism not only guarantees that hardware transactional memory systems perform no worse than a single global lock, but also significantly improves performance for both hardware and software transactional memory systems.

DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference

et al. 2020

An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth

et al. 2010


The Architectural Implications of Facebook's DNN-Based Personalized Recommendation