Metabolomics is a powerful tool for identifying both known and novel disease-related perturbations in metabolic pathways. In preclinical drug testing, it holds great potential for the early identification of drug off-target effects. Recent advances in high-precision, high-throughput mass spectrometry have brought the field to a point where quantitative, targeted metabolomic measurements with ready-to-use kits allow automated in-house screening for hundreds of different metabolites in large sets of biological samples. Today, metabolomics is arguably where transcriptomics was about five years ago, and the field therefore has a strong need for adapted bioinformatics tools and methods. In this paper we describe a systematic analysis of a targeted quantitative characterization of more than 800 metabolites in blood plasma samples from healthy and diabetic mice under rosiglitazone treatment. We show that known and new metabolic phenotypes of diabetes and medication can be recovered in a statistically objective manner. We find that concentrations of methylglutaryl carnitine respond in opposite directions to rosiglitazone treatment in healthy and in diabetic mice. Analyzing ratios between metabolite concentrations dramatically reduces the noise in the data set, allowing for the discovery of new potential biomarkers of diabetes, such as the N-hydroxyacyloylsphingosyl-phosphocholines SM(OH)28:0 and SM(OH)26:0. Using a hierarchical clustering technique on partial η² values, we identify functionally related groups of metabolites, indicating a diabetes-related shift from lysophosphatidylcholine to phosphatidylcholine levels. The bioinformatics analysis approach introduced here can be readily generalized to other drug-testing scenarios and other medical disorders.
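The two analysis ideas in this abstract, metabolite-concentration ratios and hierarchical clustering of metabolites by effect-size profiles, can be illustrated with a minimal sketch. All data below are synthetic and the metabolite names are taken from the abstract purely as labels; this is not the authors' actual pipeline. Note also that a sample-wide multiplicative noise factor (e.g. dilution) cancels in each ratio, which is one intuition for why ratios can reduce variance. Here plain correlation across samples stands in for the partial η² profiles used in the paper.

```python
import numpy as np
from itertools import combinations
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: rows = plasma samples, columns = metabolite concentrations.
rng = np.random.default_rng(0)
metabolites = ["SM(OH)28:0", "SM(OH)26:0", "lysoPC 18:1", "PC aa 36:2"]
conc = rng.lognormal(mean=2.0, sigma=0.3, size=(20, len(metabolites)))

# All pairwise concentration ratios: a shared per-sample noise factor
# cancels in conc[:, i] / conc[:, j].
pairs = list(combinations(range(len(metabolites)), 2))
ratios = np.column_stack([conc[:, i] / conc[:, j] for i, j in pairs])

# Hierarchical clustering of metabolites. The paper clusters on partial
# eta^2 effect sizes; as a stand-in, we cluster on correlation distance.
corr = np.corrcoef(conc.T)
dist = 1.0 - corr[np.triu_indices(len(metabolites), k=1)]  # condensed form
tree = linkage(dist, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
print(dict(zip(metabolites, labels)))
```

The condensed upper-triangle distance vector is the form `scipy.cluster.hierarchy.linkage` expects, so no square distance matrix is needed.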
Key Points: (1) North American ATLL has a distinct genomic landscape with a high frequency of prognostic epigenetic mutations, including EP300 mutations. (2) ATLL samples with mutated EP300 have compromised p53 function and are selectively sensitive to decitabine treatment.
Background: Data generated using 'omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of subjects in the study. In this paper, we consider issues relevant to the design of biomedical studies in which the goal is the discovery of a subset of features and an associated algorithm that can predict a binary outcome, such as disease status. We compare the performance of four commonly used classifiers (K-Nearest Neighbors, Prediction Analysis for Microarrays, Random Forests, and Support Vector Machines) in high-dimensionality data settings. We evaluate the effects of varying levels of signal-to-noise ratio in the dataset, imbalance in class distribution, and choice of metric for quantifying classifier performance. To guide study design, we present a summary of the key characteristics of 'omics' data profiled in several human or animal model experiments utilizing high-content mass spectrometry and multiplexed immunoassay-based techniques.
Results: The analysis of data from seven 'omics' studies revealed that the average magnitude of effect size observed in human studies was markedly lower than in animal studies. The data measured in human studies were characterized by higher biological variation and the presence of outliers. The results from simulation studies indicated that Prediction Analysis for Microarrays (PAM) had the highest power when the class-conditional feature distributions were Gaussian and outcome distributions were balanced. Random Forests was optimal when feature distributions were skewed and class distributions were unbalanced. We provide a free, open-source R statistical software library (MVpower) that implements the simulation strategy proposed in this paper.
Conclusion: No single classifier had optimal performance under all settings. Simulation studies provide useful guidance for the design of biomedical studies involving high-dimensionality data.
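The simulation setup described above, few subjects, many features, a small informative subset, can be sketched as follows. The paper's simulations use the R library MVpower; this is a hypothetical Python analogue using scikit-learn, where `NearestCentroid` with a `shrink_threshold` serves as a stand-in for PAM's nearest shrunken centroids. All sizes and effect magnitudes are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.svm import SVC

# Simulate a high-dimensionality dataset: 60 subjects, 500 features,
# of which only 10 carry signal (a class-conditional mean shift).
rng = np.random.default_rng(42)
n, p, informative, shift = 60, 500, 10, 1.0
y = np.repeat([0, 1], n // 2)         # balanced binary outcome
X = rng.normal(size=(n, p))
X[y == 1, :informative] += shift      # Gaussian features, mean shift in class 1

classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "PAM-like": NearestCentroid(shrink_threshold=0.5),  # shrunken centroids
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(kernel="linear"),
}
results = {}
for name, clf in classifiers.items():
    results[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {results[name]:.2f}")
```

Repeating this over many simulated datasets, and varying the shift, skewness, and class balance, yields the kind of power comparison the abstract reports; a single run only gives one cross-validated accuracy per classifier.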
Drug-induced liver injury (DILI) is the primary adverse event that results in withdrawal of drugs from the market and a frequent reason for the failure of drug candidates in development. The Liver Toxicity Biomarker Study (LTBS) is an innovative approach to investigate DILI because it compares molecular events produced in vivo by compound pairs that (a) are similar in structure and mechanism of action, (b) are associated with few or no signs of liver toxicity in preclinical studies, and (c) show marked differences in hepatotoxic potential. The LTBS is a collaborative preclinical research effort in molecular systems toxicology between the National Center for Toxicological Research and BG Medicine, Inc., and is supported by seven pharmaceutical companies and three technology providers. In phase I of the LTBS, entacapone and tolcapone were studied in rats to provide results and information that will form the foundation for the design and implementation of phase II. Molecular analysis of the rat liver and plasma samples combined with statistical analyses of the resulting datasets yielded marker analytes, illustrating the value of the broad-spectrum, molecular systems analysis approach to studying pharmacological or toxicological effects.
This paper investigates information-theoretic network complexity measures, which have already been used intensively in mathematical and medicinal chemistry, including drug design. Numerous such measures have been developed, but many lack a meaningful interpretation; in particular, it is often unclear which kind of structural information they detect. Our main contribution is therefore to shed light on the relatedness between selected information measures for graphs by performing a large-scale analysis using chemical networks. Starting from several sets containing real and synthetic chemical structures represented as graphs, we study the relatedness between a classical (partition-based) complexity measure, the topological information content of a graph, and others inferred by a different paradigm leading to partition-independent measures. Moreover, we evaluate the uniqueness of network complexity measures numerically. In general, high uniqueness is an important and desirable property when designing novel topological descriptors intended for application to large chemical databases.
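The partition-based measure named above, the topological information content, is classically defined as the Shannon entropy of a partition of the vertex set. A minimal sketch follows; as a simplifying assumption, vertices are grouped by degree rather than by the automorphism-orbit partition used in the classical definition, since orbits coincide with degree classes only for simple cases.

```python
import math
from collections import Counter

def topological_information_content(adj):
    """Partition-based graph complexity: Shannon entropy (in bits) of a
    vertex partition. Vertices are grouped by degree here as a simple,
    computable stand-in for the automorphism-orbit partition."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    classes = Counter(degrees)  # partition: degree value -> class size
    return -sum((k / n) * math.log2(k / n) for k in classes.values())

# Example: the path graph P4. Degrees are 1,2,2,1, giving two classes
# of size 2 and an entropy of exactly 1 bit.
p4 = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(topological_information_content(p4))  # -> 1.0
```

A vertex-transitive graph (one degree class) scores 0, and a graph whose vertices are pairwise distinguishable scores log₂(n), which is why such measures are read as quantifying structural diversity.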