The execution of numerically intensive programs presents a challenge to memory system designers. Numerical program execution can be accelerated by pipelined arithmetic units, but to be effective, these units must be supported by high-speed memory access. A cache memory is a well-known hardware mechanism used to reduce the average memory access latency. Numerical programs, however, often have poor cache performance. Stride directed prefetching has been proposed to improve the cache performance of numerical programs executing on a vector processor. This paper shows how this approach can be extended to a scalar processor by using a simple hardware mechanism, called a stride prediction table (SPT), to calculate the stride distances of array accesses made from within the loop body of a program. Results from selected programs in the PERFECT and SPEC benchmarks show that stride directed prefetching on a scalar processor can significantly reduce the cache miss rate of particular programs, and that an SPT needs only a small number of entries to be effective.
... the cache miss ratio for the scalar execution of the matrix multiply for matrix sizes of 100 x 100. For comparison purposes the corresponding vector execution is also shown. The results were obtained using trace-driven simulation of a 4 Kbyte cache with block sizes of 8, 16, 32 and 64 bytes. The traces are from executions on an Alliant FX/80. Each trace is for single-processor execution, where the scalar and vector versions are generated using compiler optimizations. Two miss ratios are shown for each execution: ALL means that all memory data references are simulated, and MATRIX means that only references to matrix data (data size of 8 bytes) are simulated. There are 19 and 2.2 million references for the scalar and vector executions respectively, but only 4 and 2 million of these references are to matrix data.
Note that the vector miss ratios are computed relative to the number of vector accesses and not the number of vector referencing instructions. For example, a vector instruction may load 32 elements but this is counted as 32 vector accesses.
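The stride prediction table mentioned in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware design: it assumes a direct-mapped table indexed by the address of the memory-referencing instruction, with each entry remembering the last data address that instruction touched. The class name, the `access` method, the table size of 16 and the example addresses are all hypothetical choices made for illustration.

```python
class StridePredictionTable:
    """Toy direct-mapped SPT keyed by instruction address (an assumed layout)."""

    def __init__(self, num_entries=16):
        self.num_entries = num_entries
        # Each slot holds (instruction address, last data address) or None.
        self.entries = [None] * num_entries

    def access(self, pc, data_addr):
        """Record one memory access and return a prefetch address, or None."""
        index = pc % self.num_entries
        entry = self.entries[index]
        prefetch = None
        # A stride is only computed when the same instruction was seen before.
        if entry is not None and entry[0] == pc:
            stride = data_addr - entry[1]
            if stride != 0:
                prefetch = data_addr + stride
        # Always remember the most recent address for this instruction.
        self.entries[index] = (pc, data_addr)
        return prefetch


def column_walk_predictions(n=4, elem_size=8, base=0x1000, pc=0x400):
    """Walk one column of a row-major n x n matrix of 8-byte elements."""
    spt = StridePredictionTable()
    addrs = [base + row * n * elem_size for row in range(n)]
    return [spt.access(pc, addr) for addr in addrs]
```

In this model the first access by an instruction yields no prediction, because there is no earlier address to subtract from. Every later access with a nonzero stride predicts the next address in the sequence, which is the point at which a prefetch could be issued.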
This paper presents two new algorithms, Redundant Vector Elimination (RVE) and Essential Fault Reduction (EFR), for generating compact test sets for combinational circuits under the single stuck-at fault model, and a new heuristic for estimating the minimum single stuck-at fault test set size. These algorithms, together with the dynamic compaction algorithm, are incorporated into an advanced ATPG system for combinational circuits, called MinTest. MinTest found better lower bounds and generated smaller test sets than the previously published results for the ISCAS85 and full-scan version of the ISCAS89 benchmark circuits.