Jason M. Fung scite author profile

This paper presents the concept of dynamic control independence (DCI) and shows how it can be detected and exploited in an out-of-order superscalar processor to reduce the performance penalties of branch mispredictions. We show how DCI can be leveraged during branch misprediction recovery to reduce the number of instructions squashed on a misprediction, as well as how it can be used to avoid predicting unpredictable branches by fetching instructions out-of-order. A realistic implementation is described and evaluated using six SPECint95 benchmarks. We show that exploiting DCI during branch misprediction recovery improves performance by 0.9-9.9% on a 4-wide processor, by 1.8-11.2% on an 8-wide processor and by 1.9-15.3% on a 12-wide processor. We also show that using DCI information to fetch instructions out-of-order when an unpredictable branch is encountered potentially improves performance by 0.9-15.2% on a 4-wide processor, by 2.0-14.8% on an 8-wide processor and by 2.6-16.2% on a 12-wide processor. Some of the largest performance gains are observed on go and gcc, which have traditionally posed the most difficult challenge to aggressive branch prediction techniques.


Verifying Information Flow Properties of Firmware using Symbolic Execution

Subramanyan, Malik, Khattri et al. 2016


Architectural Characterization of Processor Affinity in Network Processing

Foong, Fung, Newell et al. 2005


Network protocol stacks, in particular TCP/IP software implementations, are known for their inability to scale well in general-purpose monolithic operating systems (OS) for SMP. Previous researchers have experimented with affinitizing processes/threads, as well as interrupts from devices, to specific processors in an SMP system. However, general-purpose operating systems give minimal consideration to user-defined affinity in their schedulers. Our goal is to expose the full potential of affinity through in-depth characterization of the reasons behind performance gains. We conducted an experimental study of TCP performance under various affinity modes on IA-based servers. Results showed that interrupt affinity alone provided a throughput gain of up to 25%, and combined thread/process and interrupt affinity can achieve gains of 30%. In particular, calling out the impact of affinity on machine clears (in addition to cache misses) is a characterization that has not been done before.




Jason M. Fung

An in-depth analysis of the impact of processor affinity on network performance

Reducing branch misprediction penalties via dynamic control independence detection

Verifying Information Flow Properties of Firmware using Symbolic Execution

Architectural Characterization of Processor Affinity in Network Processing
