Both the generation and the analysis of proteome data are becoming increasingly widespread, and the field of proteomics is moving incrementally toward high-throughput approaches. Techniques are also increasing in complexity as the relevant technologies evolve. A standard representation of both the methods used and the data generated in proteomics experiments, analogous to that of the MIAME (minimum information about a microarray experiment) guidelines for transcriptomics, and the associated MAGE (microarray gene expression) object model and XML (extensible markup language) implementation, has yet to emerge. This hinders the handling, exchange, and dissemination of proteomics data. Here, we present a UML (unified modeling language) approach to proteomics experimental data, describe XML and SQL (structured query language) implementations of that model, and discuss capture, storage, and dissemination strategies. These make explicit what data might be most usefully captured about proteomics experiments and provide complementary routes toward the implementation of a proteome repository.
A molecular understanding of porcine reproduction is of biological interest and economic importance. Our Midwest Consortium has produced cDNA libraries containing the majority of genes expressed in major female reproductive tissues, and we have deposited into public databases 21,499 expressed sequence tag (EST) gene sequences from the 3' end of clones from these libraries. These sequences represent 10,574 different genes, based on sequence comparison among these data, and comparison with existing porcine ESTs and genes indicate as many as 4652 of these EST clusters are novel. In silico analysis identified sequences that are expressed in specific pig tissues or organs and confirmed the broad expression in pig for many genes ubiquitously expressed in human tissues. Furthermore, we have developed computer software to identify sequence similarity of these pig genes with their human counterparts, and to extract the mapping information of these human homologues from genome databases. We demonstrate the utility of this software for comparative mapping by localizing 61 genes on the porcine physical map for Chromosomes (Chrs) 5, 10, and 14.
Background: Proteomics is rapidly evolving into a high-throughput technology, in which substantial and systematic studies are conducted on samples from a wide range of physiological, developmental, or pathological conditions. Reference maps from 2D gels are widely circulated. However, there is, as yet, no formally accepted standard representation to support the sharing of proteomics data, and little systematic dissemination of comprehensive proteomic data sets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.