The completion of the human genome project and the development of high-throughput approaches herald a dramatic acceleration in the pace of biological research. One of the most compelling next steps will be learning the functional roles of all proteins. Achievement of this goal depends in part on the rapid expression and isolation of proteins at large scale. We exploited recombinational cloning to facilitate the development of methods for the high-throughput purification of human proteins. cDNAs were introduced into a master vector from which they could be rapidly transferred into a variety of protein expression vectors for further analysis. A test set of 32 sequence-verified human cDNAs of various sizes and activities was moved into four different expression vectors encoding different affinity-purification tags. By means of an automatable 2-hr protein purification procedure, all 128 proteins were purified and subsequently characterized for yield, purity, and steps at which losses occurred. Under denaturing conditions when the His6 tag was used, 84% of samples were purified. Under nondenaturing conditions, both the glutathione S-transferase and maltose-binding protein tags were successful in 81% of samples. The developed methods were applied to a larger set of 336 randomly selected cDNAs. Sixty percent of these proteins were successfully purified under denaturing conditions and 82% of these under nondenaturing conditions. A relational database, FLEXProt, was built to compare properties of proteins that were successfully purified and proteins that were not. We observed that some domains in the Pfam database were found almost exclusively in proteins that were successfully purified and thus may have predictive character.
With the application of large-scale and high-throughput (HT) approaches to biological and medical questions, biology has embraced a new era of technology development and information collection. 
The great task lying ahead is to elucidate the functions of all proteins encoded in the genomes of sequenced model organisms. This process involves collection of information about the temporal, spatial, and physiological regulation of proteins, their interaction partners, biochemical activities, posttranslational modifications, and the mutual influence of all these parameters on the physiology of the organism. Over the past several decades, biologists and biochemists have amassed a large collection of powerful tools for the study of individual proteins. However, compared with the study of nucleic acids, the HT study of proteins is still in its infancy. The next great challenge in biology will be to adapt these tools and develop new ones that enable the simultaneous and parallel study of thousands of proteins.
The elucidation of biochemical activity and protein-protein interactions are central aspects of understanding protein function. Protein microarrays provide one platform for biochemical experiments to be carried out at extraordinary pace (1-3). However, this exciting technology calls attention to the quest...