FA 18.4: a phase-tolerant 3.8 GB/s data-communication router for a multiprocessor supercomputer backplane

Reese, E.; Wilson, H.; Nedwek, D.; Jex, J.; Khaira, Manpreet; Burton, T.; Nag, P.V. Sunil; Kumar, Harish; Dike, C.; Finan, D.; Haycock, M.

doi:10.1109/isscc.1994.344634

Cited by 14 publications

(4 citation statements)

References 5 publications


“…The data bus can operate in a double-pumped (source synchronous [3,4]) transfer mode. That is, data transfers occur twice for every bus clock cycle (see the example in Figure 5).…”

Section: System Bus Overview (mentioning)

confidence: 99%

The IA-64 Itanium processor cartridge

2001
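
The double-pumped mode quoted above can be illustrated with a short calculation. This is a minimal sketch, not taken from the cited paper: the function name, the 100 MHz clock, and the 64-bit width are hypothetical values chosen only to show that two transfers per clock double the peak rate.

```python
def peak_bandwidth_bytes_per_s(bus_clock_hz: float,
                               bus_width_bits: int,
                               transfers_per_clock: int = 2) -> float:
    """Peak data rate of a bus that moves `transfers_per_clock` words per clock.

    transfers_per_clock=2 models the double-pumped mode described in the
    quote; transfers_per_clock=1 models a conventional single-pumped bus.
    """
    return bus_clock_hz * transfers_per_clock * bus_width_bits / 8


# Hypothetical figures, for illustration only (not from the cited paper).
single_pumped = peak_bandwidth_bytes_per_s(100e6, 64, transfers_per_clock=1)
double_pumped = peak_bandwidth_bytes_per_s(100e6, 64)

print(f"single-pumped: {single_pumped / 1e6:.0f} MB/s")
print(f"double-pumped: {double_pumped / 1e6:.0f} MB/s")
```

The clock frequency is unchanged between the two calls; only the number of transfers per cycle differs.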


“…Any time a message enters the network, it is charged a fixed network transit latency. This latency is based on the average transit time for a two-dimensional mesh network having a per-hop fallthrough time of 40 ns [Intel94]. For our 16-processor simulations, the average message requires latency equivalent to one hop to both enter and exit the network, 2.6 hops of network transit, and 3 cycles of network header information, yielding an average transit time of 220 ns, or 22 cycles.…”

Section: Common Characteristics (mentioning)

confidence: 99%

The performance impact of flexibility in the Stanford FLASH multiprocessor

et al. 1994


A flexible communication mechanism is a desirable feature in multiprocessors because it allows support for multiple communication protocols, expands performance monitoring capabilities, and leads to a simpler design and debug process. In the Stanford FLASH multiprocessor, flexibility is obtained by requiring all transactions in a node to pass through a programmable node controller, called MAGIC. In this paper, we evaluate the performance costs of flexibility by comparing the performance of FLASH to that of an idealized hardwired machine on representative parallel applications and a multiprogramming workload. To measure the performance of FLASH, we use a detailed simulator of the FLASH and MAGIC designs, together with the code sequences that implement the cache-coherence protocol. We find that for a range of optimized parallel applications the performance differences between the idealized machine and FLASH are small. For these programs, either the miss rates are small or the latency of the programmable protocol can be hidden behind the memory access time. For applications that incur a large number of remote misses or exhibit substantial hot-spotting, performance is poor for both machines, though the increased remote access latencies or the occupancy of MAGIC lead to lower performance for the flexible design. In most cases, however, FLASH is only 2%-12% slower than the idealized machine.
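
The transit-latency figure in the citation statement for this paper can be checked with a few lines of arithmetic. The sketch below is one reading of the quoted breakdown, not the authors' own calculation: it assumes one hop to enter plus one hop to exit (two hops in total), takes the 10 ns cycle implied by "220 ns, or 22 cycles", and rounds up to a whole cycle. Under those assumptions the quoted 22 cycles is reproduced.

```python
import math

HOP_NS = 40.0          # per-hop fall-through time, from the quoted passage
CYCLE_NS = 10.0        # implied by "220 ns, or 22 cycles"
ENTRY_EXIT_HOPS = 2.0  # ASSUMPTION: one hop to enter plus one hop to exit
TRANSIT_HOPS = 2.6     # average network transit, from the quoted passage
HEADER_CYCLES = 3.0    # network header information, from the quoted passage


def transit_latency_cycles() -> int:
    """Average transit latency in whole cycles, rounded up (an assumption)."""
    cycles_per_hop = HOP_NS / CYCLE_NS
    raw = (ENTRY_EXIT_HOPS + TRANSIT_HOPS) * cycles_per_hop + HEADER_CYCLES
    return math.ceil(raw)


cycles = transit_latency_cycles()
print(cycles, "cycles =", cycles * CYCLE_NS, "ns")
```

Before rounding, this reading gives about 21.4 cycles (214 ns). Counting entry and exit together as a single hop would give about 17.4 cycles instead, which does not match the quoted figure.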


“…In chip-to-chip communication, increasing the bandwidth per wire enhances system performance due to limited number of pins [1] [2]. Simultaneous Bidirectional (SBD) signaling was previously introduced to allow simultaneous data transmission in two directions over one wire, doubling the effective bandwidth per wire over a point-to-point unidirectional transmission [3].…”

Section: Introduction (mentioning)

confidence: 99%

A 8-Gb/s/pin current mode multi-level simultaneous bidirectional I/O

Kim, Kang

2008

2008 IEEE International Symposium on Circuits and Systems


This paper describes a high speed current mode multi-level simultaneous bi-directional I/O. To increase data rate, impedance matching, reference calibration, latched differential current switching and clocked comparators are used. Simulation results based on 0.18µm CMOS process show that the proposed design achieves data rate up to 8-Gb/s/pin at the power consumption of 46.8mW with 1.8V power supply.

