Alexander Giese scite author profile

This article presents the design of a generic HyperTransport (HT) core. HyperTransport is a packetbased interconnect technology for low-latency, high-bandwidth point-to-point connections. It is specially optimized to achieve a very low latency. The core has been verified in system using an FPGA. This exhaustive verification and the generic design allow the mapping to both ASICs and FPGAs. The implementation described in this work supports a 16-bit link width, as used by Opteron processors. On a Xilinx Virtex-4 FX60, the core supports a link frequency of 400 MHz DDR and offers a maximum bidirectional bandwidth of 3.2GB/s. The in-system verification has been performed using a custom FPGA board that has been plugged into a HyperTransport extension connector (HTX) of a standard Opteron-based motherboard. HTX slots in Opteron-based motherboards allow very high-bandwidth, low-latency communication, since the HTX device is directly connected to one of the HyperTransport links of the processor. Performance analysis shows a unidirectional payload bandwidth of 1.4GB/s and a read latency of 180 ns. The HT core in combination with the HTX board is an ideal base for prototyping systems and implementing FPGA coprocessors. The HT core is available as open source.

show abstract

A versatile, low latency HyperTransport core

Slogsnat

Giese

Brüning

2007

View full text Add to dashboard Cite

Highly scalable barriers for future high-performance computing clusters

Fröning

Giese

Montaner

et al. 2011

View full text Add to dashboard Cite

Although large scale high performance computing today typically relies on message passing, shared memory can offer significant advantages, as the overhead associated with MPI is completely avoided. In this way, we have developed an FPGA-based Shared Memory Engine that allows to forward memory transactions, like loads and stores, to remote memory locations in large clusters, thus providing a single memory address space. As coherency protocols do not scale with system size we completely avoid a global coherency across the cluster. However, we maintain local coherency domains, thus keeping the cores within one node coherent. In this paper, we show the suitability of our approach by analyzing the performance of barriers, a very common synchronization primitive in parallel programs. Experiments in a real cluster prototype show that our approach allows synchronization among 1024 cores spread over 64 nodes in less than 15us, several times faster than other highly optimized barriers. We show the feasibility of this approach by executing a shared-memory implementation of FFT. Finally, note that this barrier can also be leveraged by MPI applications running on our shared memory architecture for clusters. This ensures the usefulness of this work for applications already written.

show abstract

HyperTransport 3 Core: A Next Generation Host Interface with Extremely High Bandwidth

Kalisch

Giese²,

Litz³

et al. 2009

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Alexander Giese

Cost-optimal static super-replication of barrier options: an optimization approach

An open-source HyperTransport core

A versatile, low latency HyperTransport core

Highly scalable barriers for future high-performance computing clusters

HyperTransport 3 Core: A Next Generation Host Interface with Extremely High Bandwidth

Contact Info

Product

Resources

About