Much effort has been devoted to devising training regimes for a range of technologies in distributed and high-performance computing (Jandric, Artacho, Hopkins, & Fergusson, 2008). On the whole, however, these have tended to concentrate on the computational aspects of research tasks rather than the data-related aspects. There are a number of reasons for this, including the immaturity and additional complexity of the data field, the more discipline-specific nature of data usage compared with computational patterns, and the focus of providers on the "easier" problem of supplying distributed computation resources.

Data are, however, fundamental to research activities, and nearly all computational tasks (outside pure simulations) involve some form of transformation of a dataset. Having recognized this, we must therefore ask what type of training support researchers require. Do researchers already understand their datasets, and how to manipulate them, sufficiently well? It may seem impertinent to answer "no" to this latter question, but many (to generalize, all) research domains have realized in the last decade or so that changes in automation and instrumentation mean that the ability to acquire data has greatly outstripped the ability to process (analyze, manipulate, store, and archive) them. This trend has been particularly evident within the physics and biology domains, largely owing to advances in data acquisition in those fields.

There is thus a clear need, articulated by the user communities themselves, for both data-focused solutions and the concomitant training support.