Integrating heterogeneous and large omics data constitutes not only a conceptual challenge but also a practical hurdle in the daily analysis of omics data. With the rise of novel omics technologies and through large-scale consortia projects, biological systems are being investigated at an unprecedented scale, generating heterogeneous and often large data sets. These data sets encourage researchers to develop novel data integration methodologies. In this introduction we review the definition of data integration and characterize current efforts in the life sciences. We used a web survey to assess current research projects on data integration and to tap into the views, needs and challenges as currently perceived by parts of the research community.
Structural biology, homology modelling and rational drug design require accurate three-dimensional macromolecular coordinates. However, the coordinates in the Protein Data Bank (PDB) have not all been obtained using the latest experimental and computational methods. In this study a method is presented for automated re-refinement of existing structure models in the PDB. A large-scale benchmark with 16 807 PDB entries showed that they can be improved in terms of fit to the deposited experimental X-ray data as well as in terms of geometric quality. The re-refinement protocol uses TLS models to describe concerted atom movement. The resulting structure models are made available through the PDB_REDO databank (http://www.cmbi.ru.nl/pdb_redo/). Grid computing techniques were used to overcome the computational requirements of this endeavour.
Purpose: Urine proteomics is emerging as a powerful tool for biomarker discovery. The purpose of this study is to develop a well-characterized "real life" sample that can be used as a reference standard in urine clinical proteomics studies. Experimental design: We report on the generation of male and female urine samples that are extensively characterized by different platforms and methods (CE-MS, LC-MS, LC-MS/MS, 1-D gel analysis in combination with nano-LC MS/MS (using LTQ-FT ultra), and 2-DE-MS) for their proteome and peptidome. In several cases the analysis involved a definition of the actual biochemical entities, i.e. proteins/peptides associated with molecular mass and detected PTMs, and the relative abundance of these compounds. Results: The combination of different technologies allowed coverage of a wide mass range, revealing the advantages and complementarities of the different technologies. Application of these samples in "inter-laboratory" and "inter-platform" data comparison is also demonstrated. Conclusions and clinical relevance: These well-characterized urine samples are freely available upon request to enable data comparison, especially in the context of biomarker discovery and validation studies. They are also expected to provide the basis for the comprehensive characterization of the urinary proteome.