Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Batch effects are systematic non-biological differences between batches (groups) of samples in microarray experiments, arising from causes such as differences in sample preparation and hybridization protocols. Previous work focused mainly on developing methods for effective batch-effect removal. However, the impact of these methods on cross-batch prediction performance, which is one of the most important goals in microarray-based applications, has not been addressed. This paper uses a broad selection of data sets from the Microarray Quality Control Phase II (MAQC-II) effort, generated on three microarray platforms with different causes of batch effects, to assess the efficacy of batch-effect removal. Two data sets from cross-tissue and cross-platform experiments are also included. Of the 120 cases studied using support vector machines (SVM) and k-nearest neighbors (KNN) as classifiers and the Matthews correlation coefficient (MCC) as the performance metric, we find that the Ratio-G, Ratio-A, EJLR, mean-centering and standardization methods perform better than or equivalent to no batch-effect removal in 89, 85, 83, 79 and 75% of the cases, respectively, suggesting that applying these methods is generally advisable and that ratio-based methods are preferred.
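To make two of the named methods and the performance metric concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each batch is a list of samples, each sample a list of per-gene expression values, and that class labels are coded 0/1. The function names (`mean_center`, `standardize`, `mcc`) are hypothetical, and returning 0 for an undefined MCC is a common convention assumed here rather than something stated in the abstract.

```python
import math
from statistics import mean, stdev


def mean_center(batch):
    """Subtract each gene's within-batch mean from every sample in the batch."""
    n_genes = len(batch[0])
    gene_means = [mean(s[g] for s in batch) for g in range(n_genes)]
    return [[s[g] - gene_means[g] for g in range(n_genes)] for s in batch]


def standardize(batch):
    """Mean-center each gene within the batch, then divide by its sample SD."""
    n_genes = len(batch[0])
    gene_means = [mean(s[g] for s in batch) for g in range(n_genes)]
    gene_sds = [stdev(s[g] for s in batch) for g in range(n_genes)]
    # Assumption: a gene that is constant within the batch is only centered.
    return [
        [(s[g] - gene_means[g]) / (gene_sds[g] or 1.0) for g in range(n_genes)]
        for s in batch
    ]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0 when the coefficient is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Each batch is adjusted separately before training on one and testing on the other.
train_batch = mean_center([[1.0, 10.0], [3.0, 14.0]])
test_batch = mean_center([[5.0, 20.0], [9.0, 22.0]])
```

In a cross-batch setting, a classifier would be trained on the adjusted training batch and scored with `mcc` on the adjusted test batch. The ratio-based methods and EJLR are not sketched here because the abstract does not define them.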
One of the central challenges of hadron physics in the regime of strong (non-perturbative) QCD is to identify the relevant degrees of freedom of the nucleon and to explain experimental data quantitatively in terms of these degrees of freedom. Among the processes studied so far, Compton scattering plays a prominent role because the properties of the electromagnetic interaction are well understood. Several approaches to describing Compton scattering have been discussed to date. It will be shown that the most appropriate ones are provided by nonsubtracted dispersion theories of the fixed-t and fixed-θ types; the properties of these two versions are complementary, so that advantage can be taken of both. In the framework of fixed-t dispersion theory, it was possible to reproduce precisely the experimental differential cross sections obtained for the proton over a wide angular range and for energies up to 1 GeV. At energies of the first resonance region and below, precise values for the electromagnetic polarizabilities and spin-polarizabilities have been determined for the proton and the neutron. As a summary we give the following recommended experimental values for the electromagnetic polarizabilities and backward spin-polarizabilities of the nucleon: α_p = 12.0 ± 0.6, β_p = 1.9 ∓ 0.6, α_n = 12.5 ± 1.7, β_n = 2.7 ∓ 1.8, in units of 10⁻⁴ fm³, and γ
* Supported by Deutsche Forschungsgemeinschaft SPP(1034) and projects SCHU222 and 436RUS 113/510. Email address: mschuma3@gwdg.de (Martin Schumacher)