As the number of services and the size of the data involved in workflows increase, centralised orchestration techniques are reaching the limits of their scalability. In the classic orchestration model, all data pass through a centralised engine, which results in unnecessary data transfer, wastes bandwidth and turns the engine into a bottleneck for the execution of a workflow. Choreography techniques, although more complex to model, offer a decentralised alternative and are the optimal architecture for data-centric workflows: data are passed directly to where they are required, at the next service in the workflow. While orchestration is the dominant architectural approach, there are relatively few choreography languages and even fewer concrete implementations. This paper's contributions are twofold. Firstly, we argue the case for choreography in data-intensive computing and demonstrate, through workflow patterns, the scalability advantages of adopting a choreography architecture. Secondly, we introduce the Light Weight Coordination Calculus (LCC), a type of process calculus used to formally define choreographies, and the OpenKnowledge framework, a choreography-based architecture that provides the functionality for peers to coordinate in an open peer-to-peer system. Through LCC and the OpenKnowledge framework, we demonstrate in practice how choreography can be achieved in a lightweight manner with a comparatively simple process language.