RNA-seq facilitates unbiased genome-wide gene-expression profiling. However, its concordance with the well-established microarray platform must be rigorously assessed for confident use in clinical and regulatory applications. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same set of liver samples of rats under varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOA). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is highly correlated with treatment effect size, gene-expression abundance and the biological complexity of the MOA. RNA-seq outperforms microarray (90% versus 76%) in DEG verification by quantitative PCR, and the main gain is its improved accuracy for genes expressed at low levels. Nonetheless, predictive classifiers derived from both platforms performed similarly. Therefore, the endpoint studied and its biological complexity, transcript abundance, and intended application are important factors in transcriptomic research and for decision-making.
Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
We show that the Confusion Entropy, a measure of performance in multiclass problems, has a strong (monotone) relation with the multiclass generalization of a classical metric, the Matthews Correlation Coefficient. Analytical results are provided for the limit cases of general no-information (n-face dice rolling) of the binary classification. Computational evidence supports the claim in the general case.
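As an illustration of the classical metric named above, the following is a minimal sketch of the multiclass Matthews Correlation Coefficient computed from a confusion matrix, using the commonly cited K-class generalization. The function name `multiclass_mcc`, the row/column convention (rows are true classes, columns are predicted classes), and the choice to return 0.0 when the denominator vanishes are assumptions made for this example and are not taken from the abstract. The sketch does not compute the Confusion Entropy itself.

```python
from math import sqrt


def multiclass_mcc(confusion):
    """Multiclass MCC from a square confusion matrix.

    Assumed convention: confusion[i][j] is the number of samples whose
    true class is i and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Correctly classified samples lie on the diagonal.
    correct = sum(confusion[k][k] for k in range(n))
    # Row sums: how often each class truly occurs.
    true_counts = [sum(row) for row in confusion]
    # Column sums: how often each class is predicted.
    pred_counts = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    denom_sq = (total ** 2 - sum(p * p for p in pred_counts)) * (
        total ** 2 - sum(t * t for t in true_counts)
    )
    if denom_sq == 0:
        # Assumed convention for the degenerate case (for example, every
        # sample is assigned to a single class).
        return 0.0
    return numerator / sqrt(denom_sq)


# A uniform matrix mimics the no-information "dice rolling" case.
dice = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
perfect = [[5, 0], [0, 7]]
binary = [[5, 1], [2, 4]]
```

For two classes this expression reduces to the familiar binary formula (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), which gives a quick check against hand calculation.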
The life sciences are currently being transformed by an unprecedented wave of developments in molecular analysis, which include important advances in instrumental analysis as well as biocomputing. In light of the central role played by metabolism in nutrition, metabolomics is rapidly being established as a key analytical tool in human nutritional studies. Consequently, an increasing number of nutritionists integrate metabolomics into their study designs. Within this dynamic landscape, the potential of nutritional metabolomics (nutrimetabolomics) to be translated into a science that can impact health policies still needs to be realized. A key element in reaching this goal is the ability of the research community to join forces and collectively make the best use of the potential offered by nutritional metabolomics. This article, therefore, provides a methodological description of nutritional metabolomics that reflects on the state-of-the-art techniques used in the laboratories of the Food Biomarker Alliance (funded by the European Joint Programming Initiative "A Healthy Diet for a Healthy Life" (JPI HDHL)), as well as points of reflection to harmonize this field. It is not intended to be exhaustive but rather to present pragmatic guidance on metabolomic methodologies, providing readers with useful "tips and tricks" along the analytical workflow.