Abstract. In recent years, a number of infrastructures have been proposed for the collection and distribution of medical data for research purposes. The design of such infrastructures is challenging: on the one hand, they should link patient data collected from different hospitals; on the other hand, they can only use anonymised data because of privacy regulations. In addition, they should allow data depseudonymisation in case research results provide information relevant for patients' health. The privacy analysis of such infrastructures can be seen as a problem of data minimisation. In this work, we introduce coalition graphs, a graphical representation of knowledge of personal information to study data minimisation. We show how this representation allows identification of privacy issues in existing infrastructures. To validate our approach, we use coalition graphs to formally analyse data minimisation in two (de)-pseudonymisation infrastructures proposed by the Parelsnoer initiative.