Delays caused by memory access conflicts were measured for vector operations on the CRAY Y-MP and CRAY X-MP, and on two versions of the CRAY-2 with static and dynamic memory. The delays were measured as functions of vector length and the number of active processors. The observed delays were lowest for memory access operations with stride one. Considerably higher delays were observed for mixed strides or random access. For access operations with mixed strides, measurements indicate that memory access was slowed down by a factor of up to 1.7 for the 8-processor CRAY Y-MP. For the 4-processor machines, the factors were 3.1 for the CRAY X-MP, 4.4 for the CRAY-2 with dynamic memory, and 2.5 for the CRAY-2 with static memory. The results are compared with a queueing model of memory bank conflicts. A model of vector performance with vector loop unrolling is presented as a special case of a previously published model.
Problems related to the evaluation of computational speeds of supercomputers are discussed. Measurements of sequential speeds, vector speeds, and asynchronous parallel processing speeds are presented. A simple model is developed that allows us to evaluate the workload-dependent effective speed of current systems such as vector computers and asynchronous parallel processing systems. Results indicate that the effective speed of a supercomputer is severely limited by its slowest processing mode unless the fraction of the workload that has to be processed in this mode is negligibly small.
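The abstract does not give the model's formula. One form such a model could take, assumed here purely for illustration, is a work-weighted harmonic mean of the per-mode speeds: the time spent in each mode is its workload fraction divided by its speed. The function name and the example numbers below are hypothetical.

```python
def effective_speed(fractions, speeds):
    """Work-weighted harmonic mean of per-mode speeds.

    ASSUMPTION: this harmonic-mean form is an illustration, not the
    formula from the paper. fractions[i] is the share of the workload
    run in mode i; speeds[i] is the speed of that mode (any unit).
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("workload fractions must sum to 1")
    # Time per unit of work is the sum of (fraction / speed) over modes.
    time_per_unit_work = sum(f / s for f, s in zip(fractions, speeds))
    return 1.0 / time_per_unit_work


# Hypothetical workload: 10% sequential at speed 1, 90% vector at speed 100.
print(effective_speed([0.1, 0.9], [1.0, 100.0]))
```

Under this assumed form, a 10% share in a mode that is 100 times slower keeps the effective speed below 10, which matches the abstract's qualitative conclusion.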
The performance of an interleaved common memory accessed uniformly by multiple processors is modeled by queueing methods. The model includes access conflicts at the bank level while assuming an ideal access network.
A scaling relation is derived that is generally valid and indicates that memory access delays are given by the product of the bank reservation time and a function of the memory utilization, which is given by the average number of access requests arriving at a bank per bank reservation time. For light memory traffic, access delays are proportional to the square of the bank reservation time and the ratio of the number of active memory access streams to the number of memory banks.
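The scaling relation can be made concrete with a standard single-server queue. The sketch below assumes each bank behaves like an M/D/1 queue (Poisson arrivals, fixed service time equal to the bank reservation time) and that every stream issues one request per clock; the abstract states neither assumption, and the function names are hypothetical. The M/D/1 mean wait is the textbook Pollaczek-Khinchine result for deterministic service.

```python
def bank_utilization(streams, banks, t_b, requests_per_clock=1.0):
    """Average number of requests arriving at one bank per reservation time.

    ASSUMPTION: requests are spread uniformly over the banks and each
    stream issues `requests_per_clock` requests per clock period.
    """
    return streams * requests_per_clock * t_b / banks


def md1_mean_wait(rho, t_b):
    """Mean queueing delay of an M/D/1 queue with service time t_b.

    Has the form t_b * f(rho) with f(rho) = rho / (2 * (1 - rho)).
    ASSUMPTION: M/D/1 is used for illustration only.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return t_b * rho / (2.0 * (1.0 - rho))


# Light traffic: one stream, 1024 banks. Doubling t_b roughly quadruples the delay.
for t_b in (2, 4):
    rho = bank_utilization(streams=1, banks=1024, t_b=t_b)
    print(t_b, rho, md1_mean_wait(rho, t_b))
```

For small utilization this expression reduces to roughly `t_b**2 * streams / (2 * banks)`, which reproduces the light-traffic dependence described above. An open model of this kind diverges as the utilization approaches 1, so it says nothing about the heavy-load regime.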
Assuming random access patterns, an open and a closed queueing model are developed that are validated by simulations. Delay dependence on bank reservation time is quadratic for light loads and linear for very heavy loads.
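A closed model with random bank selection can be explored with a small event-driven simulation. The version below is a minimal sketch under stated assumptions, not the simulation used in the paper: each stream issues its next request one clock after its previous request is accepted, banks serve requests first come, first served, and the access network is ideal. All names and default parameter values are hypothetical.

```python
import heapq
import random


def mean_access_delay(streams, banks, t_b, requests=20000, seed=0):
    """Average wait (in clocks) before a request is accepted by its bank.

    ASSUMPTIONS: uniform random bank choice, FIFO service per bank,
    ideal network, one request per stream per clock when not blocked.
    """
    rng = random.Random(seed)
    bank_free = [0] * banks                    # clock at which each bank is free again
    ready = [(0, s) for s in range(streams)]   # (time of next request, stream id)
    heapq.heapify(ready)
    total_delay = 0
    for _ in range(requests):
        t, s = heapq.heappop(ready)            # earliest pending request
        b = rng.randrange(banks)
        start = max(t, bank_free[b])           # wait if the bank is still reserved
        total_delay += start - t
        bank_free[b] = start + t_b             # reserve the bank for t_b clocks
        heapq.heappush(ready, (start + 1, s))  # stream resumes one clock later
    return total_delay / requests


for t_b in (2, 4, 8):
    print(t_b, mean_access_delay(streams=4, banks=64, t_b=t_b))
```

Because a blocked stream issues no further requests, the load on the banks is self-limiting, which is the property that distinguishes a closed model from an open one.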
A heuristic extension of our model for vector accessing of banks in sequential mode is proposed.