In this paper, we propose to use optical NOCs to design cache access protocols for large shared L2 caches. We observe that the problem is unique because optical networks have very low latency, and in principle all the cache banks are very close to each other. A naive approach is to broadcast a request to a set of banks that might possibly contain the copy of a block. However, this approach is wasteful in terms of energy and bandwidth. Hence, we propose a novel scheme in this paper, TSI , which proposes to create a set of virtual networks (overlays) of cache banks over a physical optical NOC. We search for a block inside each overlay using a combination of multicast, and unicast messages. We additionally create support for our overlay networks by proposing optimizations to the previously proposed R-SWMR network. We also propose a set of novel hardware structures for creating and managing overlays, and for efficiently locating blocks in the overlay. The performance of the TSI scheme is within 2-3% of a broadcast scheme, and it is faster than traditional static NUCA schemes by 50%. As compared to the broadcast scheme it reduces the number of accesses, and consequently the dynamic energy by 20-30%.
In this article, we propose using optical networks-on-chip (NoCs) to design cache access protocols for large shared L2 caches. We observe that the problem is unique because optical networks have very low latency, and in principle all of the cache banks are very close to each other. A naive approach is to broadcast a request to a set of banks that might possibly contain the copy of a block. However, this approach is wasteful in terms of energy and bandwidth. Hence, we propose a set of novel schemes that create a set of virtual networks ( overlays ) of cache banks over a physical optical NoC. We search for a block inside each overlay using a combination of multicast and unicast messages. We first propose two simple protocols: TSI and Broadcast . The former uses unicast messages, and the latter uses multicast messages. We subsequently propose an improved scheme, OP_BCAST , that combines the best of TSI and Broadcast , and mainly uses restricted multicast messages. Then we propose a set of novel hardware structures for creating and managing overlays, for efficiently locating blocks in the overlay, and for implementing dynamically changing overlays with OP_BCAST . The performance of the TSI scheme is within 2% to 3% of a broadcast scheme, and it is faster than traditional schemes with electrical networks by 26%. Compared to the broadcast scheme, it reduces the number of accesses, and consequently the dynamic energy of the caches by 6% to 8%. OP_BCAST is 34% faster than the best solutions with copper-based NoCs; moreover, it reduces the dynamic energy for cache access by 33% compared to the TSI scheme.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.