This paper examines the issues surrounding efficient execution in heterogeneous grid environments. The performance of a Linux cluster and a parallel supercomputer is first compared using both benchmarks and an application. With an understanding of how benchmark and application performance is affected by processor and interconnect speed, a comparison is made with the bandwidths and latencies available in a grid testbed. Of significant concern is the fact that the available communication bandwidths and latencies span a dynamic range of 3 to 4 orders of magnitude, while processor speeds span only about one-half order of magnitude. Moreover, although both processor speed and network bandwidth are increasing very rapidly, simple propagation delay will become more significant in the network latencies seen by many grid applications. That is to say, the pipes in a grid will be getting fatter but not commensurately shorter. How are we to utilize such an infrastructure effectively? Clearly, an attractive approach is to require sufficient concurrency in the application such that a coarse-grain, data-driven model of execution can be used to hide latencies while hopefully keeping context-switching overheads low. If the "spatial component" of an application is understood, then runtime systems could also apply established techniques such as caching, compression, estimation, and speculative pre-fetching. Ideally, this low-level performance management should be encapsulated in an easy-to-use abstraction.
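As a rough illustration of the coarse-grain, data-driven execution style suggested above, the following Python sketch (all names and parameters, such as fetch_unit, LATENCY_S, and IN_FLIGHT, are hypothetical and not taken from the paper) keeps many independent work units in flight so that wide-area fetch latency is overlapped with computation on units that have already arrived. It is a minimal sketch under these assumptions, not the system described in the paper.

```python
# Hypothetical sketch of a coarse-grain, data-driven execution model:
# computation is driven by the arrival of data units, so high network
# latency is hidden as long as enough independent units are in flight.
import concurrent.futures
import time

LATENCY_S = 0.05   # assumed wide-area latency per fetch (illustrative only)
NUM_UNITS = 64     # number of independent coarse-grain work units
IN_FLIGHT = 16     # at most this many fetches are active concurrently

def fetch_unit(i):
    """Simulate fetching one coarse-grain data unit over a high-latency link."""
    time.sleep(LATENCY_S)          # stand-in for wide-area transfer time
    return list(range(i, i + 1000))

def compute(unit):
    """Local computation on one unit; runs only once its data has arrived."""
    return sum(unit)

def data_driven_run():
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=IN_FLIGHT) as pool:
        # Submit all units; the pool keeps at most IN_FLIGHT fetches active,
        # which acts as a simple speculative pre-fetch window.
        pending = [pool.submit(fetch_unit, i) for i in range(NUM_UNITS)]
        # Consume units in completion order rather than program order:
        # computation proceeds on whatever data is ready while the
        # remaining fetches are still in the pipe.
        for fut in concurrent.futures.as_completed(pending):
            results.append(compute(fut.result()))
    return results

if __name__ == "__main__":
    start = time.time()
    out = data_driven_run()
    print(f"{len(out)} units processed in {time.time() - start:.2f}s "
          f"(strictly serial fetching alone would take "
          f"{NUM_UNITS * LATENCY_S:.2f}s)")
```

In this sketch the "in flight" window plays the role of the application-level concurrency the text calls for: as long as the window is deep enough relative to the latency, the processor is rarely idle waiting on the network, and the mechanism could be hidden behind a higher-level abstraction as the paragraph suggests.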