In this work we describe pyEHR, a new toolkit for building scalable clinical and phenotypic data management systems for biomedical research applications. The toolkit uses openEHR formalisms to guarantee that clinical data descriptions are decoupled from implementation details, and it relies on NoSQL or next-generation SQL technologies to provide scalable storage back-ends.