1986
DOI: 10.1145/6497.6500

Iterated interpolation using a systolic array

Abstract: An implementation using systolic array logic of Aitken's method of iterated interpolation is described. The proposed design has a simple, linear topology, requires no clock, and makes only modest demands on the host computer. By overlapping the computation of successive function values, a processing element utilization of approximately 1/2 is achieved. The paper illustrates how “mathematical hardware” packages, as well as software library routines, may be part of the mathematical problem solver's tool kit in t…
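For readers unfamiliar with the method the abstract names, the following is a minimal software sketch of Aitken's iterated interpolation in its textbook form. It is not the paper's systolic design: the function name `aitken_interpolate` and its parameters are illustrative assumptions, and the loop runs sequentially rather than on an array of processing elements.

```python
def aitken_interpolate(xs, ys, x):
    """Evaluate the interpolating polynomial through (xs[i], ys[i]) at x.

    Textbook Aitken scheme (an assumption, not the paper's hardware layout):
    p[i] starts as ys[i]; at stage k every later entry p[i], i > k, is
    replaced by the linear cross-combination of p[k] and p[i].
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    p = list(ys)
    n = len(xs)
    for k in range(n - 1):
        # p[k] is final at this stage; entries i > k are mutually independent.
        for i in range(k + 1, n):
            p[i] = ((x - xs[k]) * p[i] - (x - xs[i]) * p[k]) / (xs[i] - xs[k])
    return p[-1]


# Usage: three samples of f(x) = x**2, evaluated at x = 1.5.
value = aitken_interpolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```

Within one stage `k`, the updates for different `i` do not depend on one another, which is the kind of structure a linear array of processing elements can exploit.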

Cited by 13 publications (5 citation statements)
References 6 publications
“…Instead of using x~ as input, we use ~ -xi. 2. Instead of programming the processors according to recursion (9) and (10), we program them according to recursions (11) and (12).…”
Section: Discussion (mentioning)
confidence: 99%
“…In this section, we present McKeown's array [12]. We embed the process dependence graph for Aitken's algorithm in spacetime.…”
Section: The McKeown Array (mentioning)
confidence: 99%
“…For the purpose of our discussion, we can assume that […] ≤ 18. Given that the transmission latency differs from one implementation technology to another and depends also on the channel bandwidth (or width), a wide range of values have been adopted for the parameter l_C, 1 ≤ l_C ≤ 3. We have investigated the effects of the parameters l_D, l_M and l_C on the achieved speedup. Figure 4 shows performance results against the network size where l_D = 4, 8, 12, and 18, and the other parameters are fixed as l_C = 1 and l_M = 2. For the completeness, the ideal linear speedup curve (highlighted in bold) has been included.…”
Section: Speedup(n) (mentioning)
confidence: 99%
“…Note that the entries in a given column can be calculated independently of one another, and they depend only on the entries in the previous column and the x_i's. This gives a straightforward parallel algorithm for the DD's, particularly suitable for systolic implementation (see [4] and [10]), where each column is computed in O(1) time using as many processors as there are entries in that particular column. Since the maximum length of a column is n and there are n columns to be calculated, this approach requires O(n) parallel arithmetic operations to calculate all the DD's using O(n) processors.…”
Section: Introduction (mentioning)
confidence: 99%
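The column-by-column scheme described in the last statement can be sketched in software as follows. This is only an illustration under stated assumptions: the function name `divided_difference_columns` is hypothetical, the standard divided-difference recurrence is assumed, and the code runs sequentially, with the list comprehension marking the entries that a parallel or systolic implementation could compute simultaneously.

```python
def divided_difference_columns(xs, ys):
    """Return every column of the divided-difference (DD) table.

    Column 0 holds the function values. Each entry of column k depends only
    on two neighbouring entries of column k - 1 and on the nodes xs, so all
    entries of one column are independent of each other.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    columns = [list(ys)]
    n = len(xs)
    for k in range(1, n):
        prev = columns[-1]
        # One "parallel step": every entry below could go to its own processor.
        columns.append(
            [(prev[i + 1] - prev[i]) / (xs[i + k] - xs[i]) for i in range(n - k)]
        )
    return columns


# Usage: samples of f(x) = x**2; the first entry of each column gives the
# Newton-form coefficients of the interpolating polynomial.
table = divided_difference_columns([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
newton_coefficients = [column[0] for column in table]
```

With n nodes there are n columns, and the longest has n entries, which matches the O(n) parallel steps on O(n) processors stated in the quotation.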