Communication time prediction is critical for parallel application performance tuning, especially for the rapidly growing field of data-intensive applications. However, making such predictions accurately is nontrivial when contention exists on different components in hierarchical networks. In this article, we derive an 'asymmetric network property' at the transmission control protocol (TCP) layer for concurrent bidirectional communications in a commercial off-the-shelf (COTS) cluster and develop a communication model as the first effort to characterize the communication times on hierarchical Ethernet networks with contention at both the network interface card and backbone cable levels. We develop a micro-benchmark for a set of simultaneous point-to-point message-passing interface (MPI) operations on a parametrized network topology, use it to validate our model extensively, and show that the model can effectively predict the communication times for simultaneous MPI operations (both point-to-point and collective communications) on resource-constrained networks. We show that if the asymmetric network property is excluded from the model, the communication time predictions are significantly less accurate than those made with it. In addition, we validate the model on a cluster of the Grid5000 infrastructure, which is a more loosely coupled platform. As such, we advocate the potential to integrate this model in performance analysis for data-intensive parallel applications. Our observation of the performance degradation caused by the asymmetric network property suggests that some part of the software stack below the TCP layer in COTS clusters needs targeted tuning, which has not yet attracted any attention in the literature.

1576 J. ZHU ET AL.

...hierarchy and resource sharing, make communication time prediction non-trivial and challenging for high-performance clusters. On the other hand, such predictions are needed more now than ever because of the increasing importance of data-intensive applications [4,5] that devote a significant amount of their total execution time in parallel processing to I/O or network communication, instead of computation. A usable performance analysis of such data-intensive applications requires that the communication model reflect the network properties accurately on state-of-the-art network topologies and technologies.

In this article, we consider Ethernet-based networks because, compared with custom interconnects (e.g., InfiniBand and Myrinet), Ethernet offers widespread compatibility, a better cost-performance tradeoff, and a superior road map to the 100-Gb standard [6,7]. As of June 2011, 1 or 10 Gb Ethernet has been used as the communication infrastructure in over 45% of the top 500 supercomputers [8]. We use the message-passing interface (MPI) as the programming model, which has become the de facto standard for application layer communication on distributed memory systems. On the basis of the transmission control protocol (TCP) messaging protocol, MPI over 1 Gb Ethernet has shown c...