Gene expression patterns were profiled during somatic embryogenesis in a regeneration-proficient maize hybrid line, Hi II, in an effort to identify genes that might be used as developmental markers or targets to optimize regeneration steps for recovering maize plants from tissue culture. Gene expression profiles were generated from embryogenic calli induced to undergo embryo maturation and germination. Over 1,000 genes in the 12,060-element arrays showed significant time variation during somatic embryo development. A substantial number of genes, largely histone and ribosomal protein genes, were down-regulated during embryo maturation, which may result from a slowdown in cell proliferation and growth at that stage. The expression of these genes recovered dramatically at germination. Other genes, up-regulated during embryo maturation, included genes encoding hydrolytic enzymes (nucleases, glucosidases and proteases) and a few storage genes (an alpha-zein and caleosin), which are good candidates for developmental marker genes. Germination is accompanied by the up-regulation of a number of stress response and membrane transporter genes, and, as expected, greening is associated with the up-regulation of many genes encoding photosynthetic and chloroplast components. Thus, some, but not all, genes typically associated with zygotic embryogenesis are significantly up- or down-regulated during somatic embryogenesis in Hi II maize line regeneration. Although many genes varied in expression throughout somatic embryo development in this study, no statistically significant gene expression changes were detected between total embryogenic callus and callus enriched for transition stage somatic embryos.
In model-based clustering based on normal-mixture models, a few outlying observations can influence both the cluster structure and the number of clusters. This paper develops a method to identify such outliers; it does not, however, attempt to identify clusters amidst a large field of noisy observations. We identify outliers as those observations in a cluster with minimal membership proportion or for which the cluster-specific variance with and without the observation is very different. Results from a simulation study demonstrate the ability of our method to detect true outliers without falsely identifying many non-outliers and, under most scenarios, show improved performance over other approaches. We use the contributed R package MCLUST for model-based clustering, but propose a modified prior for the cluster-specific variance which avoids degeneracies in estimation procedures. We also compare results from our outlier method to published results on National Hockey League data.
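The two outlier criteria named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use MCLUST: the function names, the threshold values, and the use of hard cluster labels with a plain sample variance are all assumptions made for illustration.

```python
from collections import Counter
from statistics import variance


def small_cluster_labels(labels, min_proportion=0.05):
    """Return labels of clusters holding less than `min_proportion` of the points.

    `min_proportion` is a hypothetical cutoff, not a value from the paper.
    """
    counts = Counter(labels)
    total = len(labels)
    return {lab for lab, n in counts.items() if n / total < min_proportion}


def variance_ratio_flags(cluster, threshold=5.0):
    """Flag points whose removal shrinks the cluster variance more than `threshold`-fold.

    Expects at least three values; `threshold` is a hypothetical cutoff.
    """
    full = variance(cluster)
    flags = []
    for i in range(len(cluster)):
        rest = cluster[:i] + cluster[i + 1:]  # cluster without observation i
        reduced = variance(rest)
        ratio = full / reduced if reduced > 0 else float("inf")
        flags.append(ratio > threshold)
    return flags


# One tight cluster with a single distant observation.
cluster = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]
flags = variance_ratio_flags(cluster)

# Hard labels in which cluster "c" holds 1 of 30 points.
labels = ["a"] * 15 + ["b"] * 14 + ["c"]
tiny = small_cluster_labels(labels)
```

In this toy example only the last observation is flagged by the variance comparison, and only cluster "c" falls below the proportion cutoff. A real analysis would work with fitted mixture components rather than fixed labels.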
We analyze data collected in a somatic embryogenesis experiment carried out on Zea mays at Iowa State University. The main objective of the study was to identify the set of genes in maize that actively participate in embryo development. Embryo tissue was sampled and analyzed at various time periods and under different media and light conditions. As is the case in many microarray experiments, the operator scanned each slide multiple times to find the slide-specific ‘optimal’ laser and sensor settings. The multiple readings of each slide are repeated measurements on different scales with differing censoring; they cannot be considered to be replicate measurements in the traditional sense. Yet it has been shown that the choice of reading can have an impact on genetic inference. We propose a hierarchical modeling approach to estimating gene expression that combines all available readings on each spot and accounts for censoring in the observed values. We assess the statistical properties of the proposed expression estimates using a simulation experiment. As expected, combining all available scans using an approach with good statistical properties results in expression estimates with noticeably lower bias and root mean squared error relative to other approaches that have been proposed in the literature. Inferences drawn from the somatic embryogenesis experiment, which motivated this work, changed drastically depending on whether the data were analyzed using the standard approaches or the methodology we propose.
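To see why combining scans while respecting censoring matters, consider the following deliberately simplified Python sketch. It is not the hierarchical model proposed in the abstract: it assumes the relative gain of each scan is known, that readings saturate at a 16-bit ceiling of 65535, and that a least-squares fit through the origin over unsaturated readings is adequate. The function name `combine_scans` and all values are hypothetical.

```python
SATURATION = 65535.0  # assumed 16-bit scanner ceiling


def combine_scans(readings, gains, ceiling=SATURATION):
    """Combine several readings of one spot into a single intensity estimate.

    Assumes reading = min(gain * intensity, ceiling) up to noise, with known gains.
    Returns (estimate, censored). When every reading is saturated, the estimate
    is only a lower bound and `censored` is True.
    """
    usable = [(g, y) for g, y in zip(gains, readings) if y < ceiling]
    if not usable:
        # Each saturated scan implies intensity >= ceiling / gain;
        # the lowest gain gives the tightest such bound.
        return ceiling / min(gains), True
    # Least squares through the origin over the unsaturated scans.
    numerator = sum(g * y for g, y in usable)
    denominator = sum(g * g for g, _ in usable)
    return numerator / denominator, False


# A spot with true intensity 20000 scanned at three assumed gains;
# the highest-gain scan saturates.
gains = [1.0, 2.0, 4.0]
readings = [20000.0, 40000.0, 65535.0]

estimate, censored = combine_scans(readings, gains)
naive = readings[-1] / gains[-1]  # rescaling the saturated scan alone
```

Here the combined estimate recovers 20000, while rescaling the saturated scan on its own gives about 16384, which understates the intensity. A model-based approach such as the one described above would additionally pool information across spots and model the noise.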
This manuscript is composed of two major sections. In the first section we introduce some of the biological principles that form the bases of cDNA microarrays and explain how the different analytical steps introduce variability and potential biases in gene expression measurements that can sometimes be difficult to address properly. We address statistical issues associated with the measurement of gene expression (e.g., image segmentation, spot identification), with the correction for background fluorescence, and with the normalization and re-scaling of data to remove effects of dye, print-tip and others on expression. In this section we also describe the standard statistical approaches for estimating treatment effect on gene expression, and briefly address the multiple comparisons problem, often referred to as the "big p, small n" paradox. In the second major section, we discuss the use of multiple scans as a means to reduce the variability of gene expression estimates. While the use of multiple scans under the same laser and sensor settings has already been proposed (Romualdi et al. 2003), we describe a general hierarchical modeling approach proposed by Love and Carriquiry (2005) that enables use of all the readings obtained under varied laser and sensor settings for each slide in the analyses, even if the number of readings per slide varies across slides. This technique also uses the varied settings to correct for some amount of the censoring discussed in the first section. It is to be expected that when combining scans and correcting for censoring, the estimate of gene expression will have smaller variance than it would have if based on a single spot measurement. In turn, expression estimates with smaller variance are expected to increase the power of statistical tests performed on them.