We propose UNITD, a unified ...

Introduction

Shared memory multiprocessors, including multicore processors, have many caches, and these caches must be kept coherent. For caches that hold instructions or data, coherence is almost invariably maintained with an all-hardware cache coherence protocol. Hardware controllers at the caches coordinate amongst themselves, using snooping or directories, to ensure that instructions and data are kept coherent, and this coherence is not software-visible. However, for caches that hold address translations (i.e., translation lookaside buffers), coherence is almost always maintained by an OS-managed software coherence protocol. Even for architectures with hardware control of TLB fills and evictions, when an event occurs that affects the coherence of TLB entries (e.g., eviction of a page of virtual memory), the OS ensures translation coherence through a software routine called TLB shootdown [6].

This dichotomy between using hardware for cache coherence and software for TLB coherence inspires two questions. First, why is cache coherence performed in hardware? Second, why is TLB coherence performed in software? Our answers to these questions lead us to conclude that the time is right to move TLB coherence into hardware.

We begin by exploring why cache coherence is performed in hardware, and we discover two primary reasons: performance and microarchitectural decoupling. Performance-wise, hardware is far faster than software, and for coherence this performance advantage grows as a function of the number of caches. Although using software for local activities (e.g., TLB fills and replacements) might have acceptable performance, even some architectures that have traditionally relied on software for such operations (e.g., SPARC) are now transitioning to hardware support for increased performance [29]. In contrast, activities with global coordination are painfully slow when performed in software.
For example, Laudon [23] mentions that for a page migration on the SGI Origin multiprocessor, the software routine for TLB shootdown is three times more time-consuming than the actual page move. The second reason for performing cache coherence in hardware is to create a high-level architecture that can support a variety of microarchitectures. A less hardware-constrained OS can easily accommodate heterogeneous cores as it does not have to be aware of each core's particularities [22]. Furthermore, hardware coherence enables migrating execution state between cores for performance, thermal, or reliability purposes [10,19] without software knowledge.

Given that hardware seems to be an appropriate choice for cache coherence, why has TLB coherence remained architecturally visible and under the control of software? We believe that one reason architects have not explored hardware TLB coherence is that they already have a well-established mechanism that is not too costly for systems with a small number of processors. For previous multiprocessor systems, Black [6] explains that "the low overhead of maint...
Computer systems with virtual memory are susceptible to design bugs and runtime faults in their address translation (AT) systems. Detecting bugs and faults requires a clear specification of correct behavior. To address this need, we develop a framework for AT-aware memory consistency models. We expand and divide memory consistency into the physical address memory consistency (PAMC) model, which defines the behavior of operations on physical addresses, and the virtual address memory consistency (VAMC) model, which defines the behavior of operations on virtual addresses. As part of this expansion, we show what AT features are required to bridge the gap between PAMC and VAMC. Based on our AT-aware memory consistency specifications, we design efficient dynamic verification hardware that can detect violations of VAMC and thus detect the effects of design bugs and runtime faults, including most AT-related bugs in published errata.
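As a rough illustration of why the virtual-address view needs more than a correct physical-address view, the toy Python sketch below models two cores that share a page table and memory but keep private TLBs. It is not the PAMC/VAMC formalism or the verification hardware described above; the names `Core` and `remap_demo`, the page size, and the frame numbers are all hypothetical. Every physical-address access in the sketch behaves correctly, yet a stale cached translation makes a load on a virtual address miss a store to that same virtual address.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class Core:
    """Hypothetical core with a private TLB over a shared page table."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # shared: virtual page -> physical frame
        self.memory = memory          # shared: physical address -> value
        self.tlb = {}                 # private cache of translations

    def translate(self, vaddr):
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage not in self.tlb:     # TLB miss: consult the page table
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage] * PAGE_SIZE + offset

    def store(self, vaddr, value):
        self.memory[self.translate(vaddr)] = value

    def load(self, vaddr):
        # Unwritten physical locations read as 0 in this toy model.
        return self.memory.get(self.translate(vaddr), 0)

    def invalidate(self, vpage):
        self.tlb.pop(vpage, None)


def remap_demo(shootdown):
    """Remap virtual page 0, then store on core0 and load on core1."""
    page_table = {0: 7}
    memory = {}
    core0 = Core(page_table, memory)
    core1 = Core(page_table, memory)

    core1.load(0)             # core1 caches the translation 0 -> frame 7
    page_table[0] = 9         # the mapping changes: page 0 -> frame 9
    if shootdown:
        core1.invalidate(0)   # stands in for TLB coherence, however provided

    core0.store(0, 42)        # core0 misses in its TLB and uses frame 9
    return core1.load(0)      # which frame core1 reads depends on its TLB


if __name__ == "__main__":
    print("without invalidation:", remap_demo(shootdown=False))
    print("with invalidation:   ", remap_demo(shootdown=True))
```

Without the invalidation, core1 keeps reading the old frame and never observes the store, even though no individual physical-address operation misbehaved. That is the kind of effect a specification written only in terms of physical addresses would not flag.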
We develop architectural techniques for mitigating the impact of process variability. Our techniques hide the performance effects of slow components (including registers, functional units, and L1I and L1D cache frames) without slowing the clock frequency or pessimistically assuming that all components are slow. Using ideas previously developed for other purposes (criticality-based allocation of resources, prefetching, and prefetch buffering), we allow design engineers to aggressively set the clock frequency without worrying about the subset of components that cannot meet this frequency. Our techniques outperform speed binning, because clock frequency benefits outweigh slight losses in IPC.
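The trade-off in the last sentence can be made concrete with the standard relation that, for a fixed instruction count, performance is proportional to IPC times clock frequency. The Python sketch below is only a worked example under that assumption; the function names and the sample numbers (a 10% frequency gain and a 2% IPC loss) are hypothetical and are not results from the work described above.

```python
def relative_performance(freq_gain, ipc_loss):
    """Speedup over a baseline, assuming performance is proportional to IPC * frequency.

    freq_gain: fractional clock frequency increase (0.10 means 10% faster clock)
    ipc_loss:  fractional IPC decrease (0.02 means 2% lower IPC)
    """
    return (1.0 + freq_gain) * (1.0 - ipc_loss)


def break_even_ipc_loss(freq_gain):
    """Largest fractional IPC loss that a given frequency gain can absorb."""
    return freq_gain / (1.0 + freq_gain)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the arithmetic.
    gain, loss = 0.10, 0.02
    print(f"relative performance: {relative_performance(gain, loss):.3f}")
    print(f"break-even IPC loss:  {break_even_ipc_loss(gain):.3f}")
```

Under this model a frequency gain g pays off whenever the IPC loss stays below g / (1 + g), which is why a slight IPC loss can be outweighed by a larger clock frequency benefit.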