Implementation and Evaluation of On-Chip Network Architectures

Gratz, Paul V.; Kim, Changkyu; McDonald, Robert; Keckler, Stephen W.; Burger, Doug

doi:10.1109/iccd.2006.4380859

Cited by 123 publications

(83 citation statements)

References 13 publications


“…Our earlier work examining the OCN [22] showed that real benchmark generated traffic in on-chip networks would not be modeled well by traditional synthetic loads. We perform a similar analysis for the OPN using network traces generated from tsim-cyc and characterize the network workload.…”

Section: OPN Traffic Trace Analysis (mentioning)

confidence: 99%

Implementation and Evaluation of a Dynamically Routed Processor Operand Network

Gratz

Sankaralingam

Hanson

et al. 2007

First International Symposium on Networks-on-Chip (NOCS'07)

Self Cite


Abstract: Microarchitecturally integrated on-chip networks, or micronets, are candidates to replace buses for processor component interconnect in future processor designs. For micronets, tight coupling between processor microarchitecture and network architecture is one of the keys to improving processor performance. This paper presents the design, implementation, and evaluation of the TRIPS operand network (OPN). The TRIPS OPN is a 5x5, dynamically routed, 2D mesh micronet that is integrated into the TRIPS microprocessor core. The TRIPS OPN is used for operand passing, register file I/O, and primary memory system I/O. We discuss in detail the OPN design, including the unique features that arise from its integration with the processor core, such as its connection to the execution unit's wakeup pipeline and its in-flight removal of mis-speculated traffic. We then evaluate the performance of the network under synthetic and realistic loads. Finally, we assess the processor performance implications of OPN design decisions with respect to the end-to-end latency of OPN packets and the OPN's bandwidth.


“…The completely bufferless designs either drop [19,16] or misroute (deflect) [38] flits when contention occurs. Eliminating buffers is desirable: buffers draw a significant fraction of NoC power [21] and area [17], and can increase router latency. Moscibroda and Mutlu [38] report 40% network energy reduction with minimal performance impact at low-to-medium network load.…”

Section: Introduction (mentioning)

confidence: 99%

CHIPPER: A low-complexity bufferless deflection router

Fallin

Craik

Mutlu

2011

2011 IEEE 17th International Symposium on High Performance Computer Architecture


As Chip Multiprocessors (CMPs) scale to tens or hundreds of nodes, the interconnect becomes a significant factor in cost, energy consumption and performance. Recent work has explored many design tradeoffs for networks-on-chip (NoCs) with novel router architectures to reduce hardware cost. In particular, recent work proposes bufferless deflection routing to eliminate router buffers. The high cost of buffers makes this choice potentially appealing, especially for low-to-medium network loads.

However, current bufferless designs usually add complexity to control logic. Deflection routing introduces a sequential dependence in port allocation, yielding a slow critical path. Explicit mechanisms are required for livelock freedom due to the non-minimal nature of deflection. Finally, deflection routing can fragment packets, and the reassembly buffers require large worst-case sizing to avoid deadlock, due to the lack of network backpressure. The complexity that arises out of these three problems has discouraged practical adoption of bufferless routing.

To counter this, we propose CHIPPER (Cheap-Interconnect Partially Permuting Router), a simplified router microarchitecture that eliminates in-router buffers and the crossbar. We introduce three key insights: first, that deflection routing port allocation maps naturally to a permutation network within the router; second, that livelock freedom requires only an implicit token-passing scheme, eliminating expensive age-based priorities; and finally, that flow control can provide correctness in the absence of network backpressure, avoiding deadlock and allowing cache miss buffers (MSHRs) to be used as reassembly buffers.

Using multiprogrammed SPEC CPU2006, server, and desktop application workloads and SPLASH-2 multithreaded workloads, we achieve an average 54.9% network power reduction for 13.6% average performance degradation (multiprogrammed) and 73.4% power reduction for 1.9% slowdown (multithreaded), with minimal degradation and large power savings at low-to-medium load. Finally, we show 36.2% router area reduction relative to buffered routing, with comparable timing.


“…In order to fully exploit the increasing number of cores and get enough parallelism for applications, virtualization for multicore chips is becoming necessary [8], [10], [11], [12]. The virtualized NoC solution provides several advantages such as increasing resource utilization, reducing power consumption and increasing the yield of chips [11].…”

Section: Introduction (mentioning)

confidence: 99%

“…The Regular topology, especially the 2D mesh topology, becomes a kind of popular architecture for NoC design, for it is very simple and efficient from a layout perspective [9], [10]. For the traditional routing algorithms, logic-based routing algorithm (e.g.…”

Section: Introduction (mentioning)

confidence: 99%

Convex-Based DOR Routing for Virtualization of NoC

Sun

Zhang

Li

et al. 2010

Lecture Notes in Computer Science


Abstract. Network on Chip (NoC) is proposed as a promising intra-chip communication infrastructure. A simple and efficient routing scheme is important for large-scale NoC to provide the required communication performance to applications with low area and power overheads. Although mesh is preferred for NoC, virtualization may lead to irregular topologies. In this paper, we propose a Convex-Based DOR (CBDOR) routing scheme for convex topologies. We demonstrate the connectedness and deadlock-freedom of CBDOR. This routing mechanism relies only on two bits per switch. Simulation results show that the area overhead of the CBDOR switch is just 2.2% higher than that of a traditional DOR switch, with negligible added complexity. Therefore, the simplicity of the routing mechanism and switch architecture makes CBDOR more practical and scalable than LBDR and FDOR.

