Intensive, Asynchronous Collaboration
As the flood of data associated with leading-edge computational science continues to escalate, the challenge of supporting the distributed collaborations that are now characteristic of it becomes increasingly daunting. The chief obstacles to progress on this front lie less in the synchronous elements of collaboration, which have been reasonably well addressed by new global high-performance networks, than in the asynchronous elements, where appropriate shared storage infrastructure seems to be lacking. The recent report from the Department of Energy on the emerging "data management challenge" [1] captures the multidimensional nature of this problem succinctly:
"Data inevitably needs to be buffered, for periods ranging from seconds to weeks, in order to be controlled as it moves through the distributed and collaborative research process. To meet the diverse and changing set of application needs that different research communities have, large amounts of non-archival storage are required for transitory buffering, and it needs to be widely dispersed, easily available, and configured to maximize flexibility of use. In today's grid fabric, however, massive storage is mostly concentrated in data centers, available only to those with user accounts and membership in the appropriate virtual organizations, allocated as if its usage were non-transitory, and encapsulated behind legacy interfaces that inhibit the flexibility of use and scheduling. This situation severely restricts the ability of application communities to access and schedule usable storage where and when they need to in order to make their workflow more productive." (p.69f)
One possible strategy to deal with this problem lies in creating a storage infrastructure that can be universally shared because it provides only the most generic of asynchronous services. Different user communities then define higher-level services as necessary to meet their needs.
One model of such a service is a Storage Network, analogous to those used within computation centers, but designed to operate on a global scale. Building on a basic storage service that is as primitive as possible, such a Global Storage Network would define a framework within which higher-level services can be created. If this framework enabled a variety of more specialized middleware and supported a wide array of applications, then interoperability and collaboration could occur based on that common framework.
The research in Logistical Networking (LN) carried out under the DOE's SciDAC program tested the value of this approach within the context of several SciDAC application communities. Below we briefly describe the basic design of the LN storage network and some of the results that the Logistical Networking community has achieved.
Infrastructure and Fundamental Services
The core services of any Global Storage Network are the allocation of persistent buffers and transfer of data between such buffers. Using these operations, a wide range of operations can be implemented, includ...