SUMMARY
Weaver is a high‐level distributed computing framework that enables researchers to construct scalable scientific data‐processing workflows. Instead of developing a new workflow language, we introduce a domain‐specific language built on top of Python called Weaver, which takes advantage of users' familiarity with the programming language, minimizes barriers to adoption, and allows for integration with a rich ecosystem of existing software. In this paper, we provide an overview of Weaver's programming model, which allows users to organize and specify scientific workflows by using a collection of datasets, functions, and abstractions. We also explain how these workflow specifications are compiled into a directed acyclic graph that is used by the Makeflow workflow manager to dispatch work to a variety of distributed execution platforms. To demonstrate the power and benefits of using the framework in constructing scientific research applications, the paper examines four distinct real‐world applications scripted using Weaver and analyzes the performance, scalability, and impact of the distributed generated scientific workflows. Copyright © 2011 John Wiley & Sons, Ltd.