Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Background: In recent years, the maturation of microarray technology has allowed the genome-wide analysis of gene expression patterns to identify tissue-specific and ubiquitously expressed ('housekeeping') genes. We have performed a functional and topological analysis of housekeeping and tissue-specific networks to identify universally necessary biological processes, and those unique to or characteristic of particular tissues.
A single cancer cell contains large numbers of genetic alterations that in combination create the malignant phenotype. However, whether amplified and mutated genes form functional and physical interaction networks that could explain the selection for cells with combined alterations is unknown. To investigate this issue, we characterized copy number alterations in 191 breast tumors using dense single nucleotide polymorphism arrays and identified 1,747 genes with copy number gain organized into 30 amplicons. Amplicons were distributed unequally throughout the genome. Each amplicon had a distinct enrichment pattern in pathways, networks, and molecular functions, but genes within individual amplicons did not form coherent functional units. Genes in amplicons included all major tumorigenic pathways and were highly enriched in breast cancer-causative genes. In contrast, 1,188 genes with somatic mutations in breast cancer were distributed randomly over the genome, did not represent a functionally cohesive gene set, and were relatively less enriched in breast cancer marker genes. Mutated and gained genes did not show statistically significant overlap but were highly synergistic in populating key tumorigenic pathways including transforming growth factor β, WNT, fibroblast growth factor, and PIP3 signaling. In general, mutated genes were more frequently upstream of gained genes in transcription regulation signaling than vice versa, suggesting that mutated genes are mainly regulators, whereas gained genes are mostly regulated. ESR1 was the major transcription factor regulating amplified but not mutated genes. Our results support the hypothesis that multiple genetic events, including copy number gains and somatic mutations, are necessary for establishing the malignant cell phenotype. [Cancer Res 2008;68(22):9532-40]
Genomic biomarkers for the detection of drug-induced liver injury (DILI) from blood are urgently needed for monitoring drug safety. We used a unique data set as part of the Food and Drug Administration-led MicroArray Quality Control Phase-II (MAQC-II) project consisting of gene expression data from two tissues (blood and liver) to test cross-tissue predictability of genomic indicators to a form of chemically induced liver injury. We then used the genomic indicators from the blood as biomarkers for prediction of acetaminophen-induced liver injury and showed that the cross-tissue predictability of a response to the pharmaceutical agent (accuracy as high as 92.1%) is better than, or at least comparable to, that of non-therapeutic compounds. We provide a database of gene expression for the highly informative predictors, which brings biological context to the possible mechanisms involved in DILI. Pathway-based predictors were associated with inflammation, angiogenesis, Toll-like receptor signaling, apoptosis and mitochondrial damage. The results demonstrate for the first time, and support the hypothesis, that genomic indicators in the blood can serve as potential diagnostic biomarkers predictive of DILI.
Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQC-II) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis on the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point- and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding and application of predictive genomic signatures, and support their broader application in predictive medicine.