This paper discusses the design concepts of a lock mechanism for a Parallel Inference Machine (the PIM/c prototype) and investigates the performance of the mechanism in detail. Lock operations are extremely frequent on the PIM; however, lock contention rarely occurs during normal memory usage. For this reason, the lock mechanism is designed to minimize the lock overhead when there is no contention. This is done by using an invalidation lock mechanism, which utilizes the exclusive state of the snooping cache and in which the locked address is not broadcast. Experimental results demonstrate the benefits of the lock mechanism when lock contention is low. They also confirm that, in most cases, the lock mechanism works well on the PIM. However, the mechanism is also found to cause performance degradation when a locked address is accessed by multiple processing elements (PEs) in a tightly-coupled multi-processor (TCMP). This is because data shared by all the PEs, such as the flags for inter-PE communication, may be accessed by multiple PEs at the same time, thus generating heavy contention. This paper also shows that combining a register-based broadcasting facility with the proposed lock mechanism can solve the above problem.
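The idea of locking through the cache's exclusive state can be illustrated with a toy model. The sketch below is not the PIM/c hardware protocol: the three-state cache, the `Bus` class, and the rule that a snooping holder refuses an invalidation for a locked line are all simplifying assumptions made for illustration. It shows only the property the abstract emphasizes, namely that locking a line already held exclusively needs no bus transaction, so the locked address is never broadcast.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"


class Bus:
    """Toy snooping bus shared by n_pes caches (hypothetical model)."""

    def __init__(self, n_pes):
        self.caches = [dict() for _ in range(n_pes)]  # per PE: addr -> State
        self.locked = [set() for _ in range(n_pes)]   # per PE: locked addrs
        self.transactions = 0                         # bus transactions issued

    def lock(self, pe, addr):
        """Try to lock addr for pe; return True on success."""
        if self.caches[pe].get(addr, State.INVALID) is not State.EXCLUSIVE:
            # The line is not exclusive here, so one invalidation goes out.
            self.transactions += 1
            # Assumed rule: a PE holding the line locked refuses the request.
            for other, held in enumerate(self.locked):
                if other != pe and addr in held:
                    return False
            # Otherwise every other copy is invalidated.
            for other, cache in enumerate(self.caches):
                if other != pe and addr in cache:
                    cache[addr] = State.INVALID
            self.caches[pe][addr] = State.EXCLUSIVE
        # Exclusive line: the lock is recorded locally, with no broadcast.
        self.locked[pe].add(addr)
        return True

    def unlock(self, pe, addr):
        """Release the lock; the line stays exclusive in pe's cache."""
        self.locked[pe].discard(addr)


bus = Bus(n_pes=2)
bus.lock(0, 0x10)        # first lock: one invalidation to gain exclusivity
bus.unlock(0, 0x10)
bus.lock(0, 0x10)        # re-lock on an exclusive line: no bus traffic
print(bus.transactions)  # 1
```

In this model, a second PE that tries to lock the same address while it is held pays a bus transaction and is refused, which mirrors in miniature why heavily shared data such as inter-PE flags becomes the costly case.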
An architecture called the slice-exchange-point (SEP) has been designed for federating heterogeneous network-virtualization platforms by creating and managing slices (virtual networks). SEP enables whole inter-domain resources to be managed by the network manager of any single domain. Slice-operation commands are propagated to other domains through SEP by using a common API. SEP introduces the following four features: infrastructure neutrality, single interface federation, abstract and clean federation, and extensibility of capabilities. SEP's functions to achieve these features are discussed. SEP was partially implemented on two VNode domains and one ProtoGENI domain and was verified to function effectively.
The characteristics of a cluster-structure parallel computer are analyzed and evaluated on the PIM/c parallel inference machine, which consists of eight-processor shared-memory clusters communicating through a processor connected to a network. To avoid communication bottlenecks, the maximum number of processors in a cluster is limited by the ratio of communication operations to program-execution operations. Since this ratio can be as high as 30% on the PIM/c, the network receiving operations should be distributed to processors in the same cluster.
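A back-of-the-envelope calculation makes the cluster-size limit concrete. The model here is an assumption for illustration, not the paper's analysis: one processor handles all network receiving for the cluster at the same speed as the others, and each processor generates communication work equal to a fixed fraction `comm_ratio` of its execution work. The receiving processor then keeps up only while `n * comm_ratio <= 1`. The function name `max_cluster_size` is hypothetical.

```python
import math


def max_cluster_size(comm_ratio: float) -> int:
    """Largest n with n * comm_ratio <= 1 under the assumed model."""
    if not 0 < comm_ratio <= 1:
        raise ValueError("comm_ratio must be in (0, 1]")
    return math.floor(1 / comm_ratio)


# At the 30% ratio quoted above, a single receiving processor
# could serve only a few processors under this model.
print(max_cluster_size(0.30))  # 3
```

Under these assumptions the limit falls well below the eight processors in a PIM/c cluster, which is consistent with the conclusion that receiving operations should be spread across the processors of the cluster.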