Kaveh Aasaraai scite author profile

Soft processors often use data caches to reduce the gap between processor and main memory speeds. To achieve high efficiency, simple, blocking caches are used. Such caches are not appropriate for processor designs such as Runahead and out-of-order execution that require nonblocking caches to tolerate main memory latencies. Instead, these processors use non-blocking caches to extract memory level parallelism and improve performance. However, conventional non-blocking cache designs are expensive and slow on FPGAs as they use content-addressable memories (CAMs). This work proposes NCOR, an FPGA-friendly non-blocking cache that exploits the key properties of Runahead execution. NCOR does not require CAMs and utilizes smart cache controllers. A 4 KB NCOR operates at 329 MHz on Stratix III FPGAs while it uses only 270 logic elements. A 32 KB NCOR operates at 278 Mhz and uses 269 logic elements.

show abstract

Toward virtualizing branch direction prediction

Sadooghi-Alvandi

Aasaraai

Moshovos

2012

View full text Add to dashboard Cite

This work introduces a new branch predictor design that increases the perceived predictor capacity without increasing its delay by using a large virtual second-level table allocated in the second-level caches. Virtualization is applied to a state-of-theart multi-table branch predictor. We evaluate the design using instruction count as proxy for timing on a set of commercial workloads. For a predictor whose size is determined by access delay constraints, accuracy can be improved by 8.7%. Alternatively, the design can be used to achieve the same accuracy as a non-virtualized design while using 25% less dedicated storage.

show abstract

Low-cost, high-performance branch predictors for soft processors

Aasaraai

Moshovos

2013

View full text Add to dashboard Cite

An Efficient Non-blocking Data Cache for Soft Processors

Aasaraai

Moshovos

2010

View full text Add to dashboard Cite

Abstract-Soft processors often use data caches to reduce the gap between processor and main memory speeds. To achieve high efficiency, simple, blocking caches are used. Such caches are not appropriate for processor designs such as runahead and out-of-order execution that require non-blocking caches to tolerate main memory latencies. Conventional nonblocking caches are expensive and slow on FPGAs as they use content-addressable memories (CAMs). This work exploits key properties of runahead execution and demonstrates an FPGA-friendly non-blocking cache design that does not require CAMs. A non-blocking 4KB cache operates at 329MHz on Stratix III FPGAs while it uses only 270 logic elements. A 32KB non-blocking cache operates at 278Mhz and uses 269 logic elements.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Kaveh Aasaraai

Towards a viable out-of-order soft core: Copy-Free, checkpointed register renaming

NCOR: An FPGA-Friendly Nonblocking Data Cache for Soft Processors with Runahead Execution

Toward virtualizing branch direction prediction

Low-cost, high-performance branch predictors for soft processors

An Efficient Non-blocking Data Cache for Soft Processors

Contact Info

Product

Resources

About