Widening resources: a cost-effective technique for aggressive ILP architectures

Lopez, D.; Llosa, J.; Valero, Mateo; Ayguade, E.

doi:10.1109/micro.1998.742785

Cited by 10 publications

(12 citation statements)

References 24 publications

“…In parallel, the range of addresses held in this cache line are compared with the addresses of pending loads, and all loads that access to the same line are served from the single access (in our approach, only 4 pending loads can be served at the same cycle). This organization has been previously proposed elsewhere [11,12,22,23].…”

Section: Wide Bus (mentioning)

confidence: 67%

Speculative dynamic vectorization. Pajuelo, González, Valero. 2002. SIGARCH Comput. Archit. News.

“…Lopez et al [11] propose and evaluate aggressive wide VLIW architectures oriented to numerical applications. The main idea is to take advantage on the existence of stride one in numerical and multimedia loops.…”

Section: Related Work (mentioning)

confidence: 99%

Speculative dynamic vectorization. Pajuelo, González, Valero. 2002. SIGARCH Comput. Archit. News.

“…On one side, resource replication consists on increasing the number of functional units available in the processor. On the other side, resource widening [13] consists on increasing the number of operations that each functional unit can simultaneously perform per cycle (i.e. functional units that operate with short vectors).…”

Section: Resource-bound Loops (mentioning)

confidence: 99%

“…Although replication enables the exploitation of more ILP than widening, its larger costs (in terms of area and cycle time) precludes the use of high degrees of replication in favour of a combination of small degrees of replication and widening. A detailed performance/cost analysis of different future processor configurations based on a combination of replication and widening can be found elsewhere [13].…”

Section: Resource-bound Loops (mentioning)

confidence: 99%

“…The loops performance can be augmented by increasing the number of functional units (replication technique), by exploiting data parallelism at the functional unit level (like in vector processors [27] or, in superscalar and VLIW processors, using the widening technique [13][14] [20]), or by using functional units that can perform multiple operations as a monolithic operation (e.g. fused multiply and add FMA floating-point units perform a multiplication and a dependent addition as a single operation).…”

Section: Introduction (mentioning)

confidence: 99%
