The Warp Computer: Architecture, Implementation, and Performance

Annaratone, Marco; Arnould, E.; Groß, Thomas; Kung, H. T.; Lam, Monica S.; Menzilcioglu, O.; Webb, Jon A.

doi:10.1109/tc.1987.5009502

Cited by 278 publications

(61 citation statements)

References 26 publications


“…In the past, pipelined computation was the basis for systolic arrays [16,3] and vector processors Cray-1 [26]. More recently, stream processing has been a topic of research in both the industry and the academia.…”

Section: Related Research (mentioning)

confidence: 99%

Architectural Considerations for Efficient Software Execution on Parallel Microprocessors

Vadlamani

Jenks

2007

2007 IEEE International Parallel and Distributed Processing Symposium


Chip Multiprocessors (CMPs) and Simultaneous Multithreading (SMT) processors provide high performance but put more pressure on the memory interface than their single-thread counterparts. The "memory wall" problem is exacerbated by multiple threads sharing a memory interface, and will get worse as more cores are added. Therefore, communications between cores, using shared caches or fast interconnects between private caches, are needed to keep the CPUs busy without burdening the memory interface. Multiple CMP systems add another dimension to this challenging problem, as the communication mechanism is no longer uniform. To parallelize data-intensive applications for high performance on these systems, one must explore a number of execution behaviors in a complex architecture-dependent exercise that entails identifying key components of the communication subsystem and understanding their behavior under varying workloads. As part of ongoing research into efficient program execution models for parallel microprocessors, we have developed a tool to evaluate the performance of the storage controllers at different levels of the memory hierarchy under varying workloads and measure cache coherence overhead. The tool allows exploration of architectural features of real processors that affect the performance of several parallel execution approaches. Here, we demonstrate its use by evaluating two of our parallel programming models that employ architecture-specific optimizations and compare them to a conventional model for several applications on parallel microprocessors.



“…We describe the machine only briefly here; details are available from a separate paper [3]. The Warp array performs the bulk of the computation.…”

Section: Warp (mentioning)

confidence: 99%

Computational models for parallel computers

Kung

1988

Phil. Trans. R. Soc. Lond. A


Computational models define the usage patterns of a computer. They can be used to derive the architecture of the machine, provide guidelines for programming tools, and suggest how the machine should be used in applications. Identifying computational models is especially important for parallel computers, since their architectures and usages are still not well understood in general. This paper describes a number of computational models for parallel computers. These models characterize the communication patterns under which processors exchange their intermediate results during computation. Emphases are placed upon models for one-dimensional processor arrays, reflecting Carnegie Mellon's experiences with the Warp systolic array machine. These models include local computation, domain partition, pipeline, multifunction pipeline and ring.


“…Using these values, it is straightforward to trace a shortest path to the source from any other position. That is, the next position on the path is a neighboring position whose value leads to the value of the current position using equation (1) or (2). This procedure continues until the source is reached.…”

Section: Obtaining a Shortest Path (mentioning)

confidence: 99%

“…Suppose that a cell takes a unit time to perform the computation corresponding to equation (1) or (2). Then for both the mapping methods, the execution time for either the red or blue sweep is:…”

Section: 3 Performance Analysis (mentioning)

confidence: 99%


Path planning on the warp computer: using a linear systolic array in dynamic programming

Bitz

Kung

1988

International Journal of Computer Mathematics


Given a map in which each position is associated with a traversability cost, the path planning problem is to find a minimum-cost path from a source position to every other position in the map. The paper proposes a dynamic programming algorithm to solve the problem, and analyzes the exact number of operations that the algorithm takes. The algorithm accesses the map in a highly regular way, so it is suitable for parallel implementation. The paper describes two general methods of mapping the dynamic programming algorithm onto the linear systolic array in the Warp machine developed by Carnegie Mellon. Both methods have led to efficient implementations on Warp. It is concluded that a linear systolic array of powerful cells like the one in Warp is effective in implementing the dynamic programming algorithm for solving the path planning problem.
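As a rough illustration of the kind of problem this abstract describes, the sketch below computes minimum path costs on a small grid by repeated raster sweeps and then traces a path back to the source by stepping to the lowest-valued neighbor. It is not the paper's algorithm and models no systolic mapping: the 4-connected neighborhood, the convention that entering a cell costs that cell's (strictly positive) traversability cost, the forward-then-backward sweep order, and the names `plan_costs` and `trace_path` are all assumptions made for illustration.

```python
import math


def _neighbors(r, c, rows, cols):
    """Yield the in-bounds 4-connected neighbors of (r, c) (assumed connectivity)."""
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def plan_costs(cost, source):
    """Return the minimum path cost from `source` to every cell of `cost`.

    Assumes every entry of `cost` is strictly positive and that a path
    pays the cost of each cell it enters (the source itself is free).
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0

    def relax(r, c):
        # Try to lower dist[r][c] using the current values of its neighbors.
        if (r, c) == source:
            return False
        best = cost[r][c] + min(
            (dist[nr][nc] for nr, nc in _neighbors(r, c, rows, cols)),
            default=math.inf,
        )
        if best < dist[r][c]:
            dist[r][c] = best
            return True
        return False

    changed = True
    while changed:  # repeat sweeps until no value changes
        changed = False
        for r in range(rows):  # forward raster sweep
            for c in range(cols):
                changed |= relax(r, c)
        for r in reversed(range(rows)):  # backward raster sweep
            for c in reversed(range(cols)):
                changed |= relax(r, c)
    return dist


def trace_path(dist, source, target):
    """Walk from `target` to `source`, always stepping to the cheapest neighbor."""
    rows, cols = len(dist), len(dist[0])
    path = [target]
    r, c = target
    while (r, c) != source:
        r, c = min(_neighbors(r, c, rows, cols), key=lambda p: dist[p[0]][p[1]])
        path.append((r, c))
    return path


grid = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
dist = plan_costs(grid, (0, 0))
path = trace_path(dist, (0, 0), (2, 0))
```

In this example the cheapest route to the bottom-left cell goes around the two expensive cells rather than through them. The regular row-by-row access pattern of the sweeps is the property the abstract identifies as making the computation suitable for parallel implementation.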

