Overlapping computation with communication is a key technique to conceal the effect of communication latency on the performance of parallel applications. MPI is a widely used message passing standard for high performance computing.
One of the most important factors in achieving a good level of overlap is the ability of MPI to make progress on outstanding communication operations. In this paper, we address some of the communication progress shortcomings in the current polling- and RDMA Read-based Rendezvous protocol used for transferring large messages in MPI. We then propose a novel speculative Rendezvous protocol that uses RDMA Read and RDMA Write to effectively improve communication progress and, consequently, the overlap ability. Performance results based on a modified MPICH2 over 10-Gigabit iWARP Ethernet reveal a significant (80-100%) improvement in receiver-side overlap and progress ability.
Power management and energy savings in high performance computing have become increasingly important design constraints. Many power/energy saving methods are built on power consumption models, which commonly rely on hardware performance monitoring counters (PMCs). Processor manufacturers provide various events that can be monitored using PMCs. PMC event selection has been mainly based on architectural intuitions. However, efficient use of PMCs requires a carefully selected set of events. Therefore, a comprehensive study of PMC events with regard to power modeling is needed to understand and enhance such power models. In this paper, we study the relationship of PMC events with power consumption in the context of single-PMC and multi-PMC power models. Our OpenMP applications are from the NAS Parallel Benchmarks (BT, CG, LU, and SP) running on an AMD machine. We present the single-PMC selection results for each of our test applications, as well as a unified list for all four applications. Unlike other work that does not consider PMCs as each other's covariates, we present a method to select the most correlated set of PMC events for a given application. Our method finds the desired set of events with six times fewer executions than a principal component analysis (PCA) method. In addition, we have investigated the measurement variability of the correlation coefficients. The 95% confidence intervals of the power-PMC and PMC-PMC correlation coefficients fall within 1.6% and 2.3% of their measured values, respectively. Furthermore, we study the power and PMC trends as time series and show that power estimates can be improved beyond those of common regression methods. We show that the ARMAX model, a time series candidate for real-time power estimation, can estimate system power consumption with a mean absolute error (total signal) of 0.1-0.5% in our applications.
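To make the single-PMC selection idea concrete, the following is a minimal sketch of ranking events by the strength of their correlation with power. It assumes the Pearson correlation coefficient as the measure, and it does not reproduce the multi-PMC method of the paper, which also accounts for PMC-PMC covariance. The event names (`event_a`, `event_b`, `event_c`), the helper functions, and all sample values are hypothetical and are not measurements from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_events_by_power_correlation(power, pmc_samples):
    """Return (event, r) pairs sorted by |r| with power, strongest first."""
    scores = [(name, pearson(counts, power))
              for name, counts in pmc_samples.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Made-up illustrative samples (assumption: one power reading and one
# count per event for each sampling interval).
power_watts = [100.0, 110.0, 120.0, 130.0, 140.0]
pmc_samples = {
    "event_a": [10.0, 20.0, 30.0, 40.0, 50.0],  # rises linearly with power
    "event_b": [5.0, 3.0, 6.0, 2.0, 7.0],       # only weakly related
    "event_c": [50.0, 41.0, 29.0, 21.0, 10.0],  # falls as power rises
}

ranking = rank_events_by_power_correlation(power_watts, pmc_samples)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by absolute value keeps strongly negative correlations, which are as informative for a linear power model as strongly positive ones.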