There has been a vast amount of work to develop programming models that provide good performance across machine architectures, are easy to use, and have predictable performance. Similarly, the design and optimization of architectures to achieve optimal performance for an application class remains a challenging task. Accurate cost modeling is essential for both application development and system design.

Many scientific computing codes are developed by using libraries that provide custom-built collective communication primitives. For example, the family of Bulk Synchronous Parallel (BSP) machine models provides suitable tools for analyzing such problems. However, modeling the effect of bandwidth limitations for globally unbalanced communication and estimating the hierarchical bandwidth used by applications remain key challenges.

We present a hierarchical bandwidth machine model (alphaDBSP) that naturally extends the Decomposable BSP (DBSP) model by associating a bandwidth growth factor alpha with each message pattern. Algorithms executed on alphaDBSP have a runtime that is at least as good as on DBSP. Hence, there are globally unbalanced problems for which alphaDBSP analysis is simpler or more accurate. We present three scientific computing kernels that illustrate the differences between alphaDBSP and DBSP analysis.

Similar to the BSP family of models, alphaDBSP predicts collective communication execution time for a given machine. Additionally, alphaDBSP estimates the hierarchical bandwidth required by a given application. System architects may use this estimate to design machines that avoid bandwidth bottlenecks for their target application class.