Background
A generally accepted approach to the analysis of RNA-Seq read count data does not yet exist. We sequenced the mRNA of 726 individuals from the Drosophila Genetic Reference Panel in order to quantify differences in gene expression among single flies. One of our experimental goals was to identify the optimal analysis approach for the detection of differential gene expression among the factors we varied in the experiment: genotype, environment, sex, and their interactions. Here we evaluate three different filtering strategies, eight normalization methods, and two statistical approaches using our data set. We assessed differential gene expression among factors and performed a statistical power analysis using the eight biological replicates per genotype, environment, and sex in our data set.
Results
We found that the most critical considerations for the analysis of RNA-Seq read count data were the normalization method, underlying data distribution assumption, and numbers of biological replicates, an observation consistent with previous RNA-Seq and microarray analysis comparisons. Some common normalization methods, such as Total Count, Quantile, and RPKM normalization, did not align the data across samples. Furthermore, analyses using the Median, Quantile, and Trimmed Mean of M-values normalization methods were sensitive to the removal of low-expressed genes from the data set. Although it is robust in many types of analysis, the normal data distribution assumption produced results vastly different than the negative binomial distribution. In addition, at least three biological replicates per condition were required in order to have sufficient statistical power to detect expression differences among the three-way interaction of genotype, environment, and sex.
Conclusions
The best analysis approach to our data was to normalize the read counts using the DESeq method and apply a generalized linear model assuming a negative binomial distribution using either edgeR or DESeq software. Genes having very low read counts were removed after normalizing the data and fitting it to the negative binomial distribution. We describe the results of this evaluation and include recommended analysis strategies for RNA-Seq read count data.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2353-z) contains supplementary material, which is available to authorized users.
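The DESeq normalization named in the Conclusions above is commonly described as a median-of-ratios scheme: each sample is compared to a per-gene geometric-mean pseudo-reference, and the median ratio becomes that sample's size factor. The following is a minimal Python sketch of that general idea, not the authors' pipeline and not the DESeq software itself. The function names and the toy count matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample from a genes x samples count matrix.

    Illustrative sketch of a median-of-ratios scheme; names are hypothetical.
    """
    counts = np.asarray(counts, dtype=float)
    # Use only genes with nonzero counts in every sample so all logs are finite.
    expressed = np.all(counts > 0, axis=1)
    if not expressed.any():
        raise ValueError("no gene has nonzero counts in every sample")
    log_counts = np.log(counts[expressed])
    # Log of the per-gene geometric mean across samples (the pseudo-reference).
    log_reference = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median over genes of (sample count / reference count).
    return np.exp(np.median(log_counts - log_reference, axis=0))


def normalize_counts(counts):
    """Divide each sample (column) by its size factor."""
    size_factors = median_of_ratios_size_factors(counts)
    return np.asarray(counts, dtype=float) / size_factors, size_factors


# Toy data (made up): sample 2 was sequenced at twice the depth of sample 1.
toy_counts = np.array([
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 5],  # excluded from size-factor estimation because of the zero
])
normalized, size_factors = normalize_counts(toy_counts)
```

In this toy case the second sample's size factor is twice the first's, so the normalized counts of the three fully expressed genes coincide across samples. Because the median is taken over genes, a small number of strongly differentially expressed genes has limited influence on the size factor, unlike a total-count scaling.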
Why do some individuals need more sleep than others? Forward mutagenesis screens in flies using engineered mutations have established a clear genetic component to sleep duration, revealing mutants with very long or very short sleep. Whether such extreme long or short sleep could exist in natural populations was unknown. We applied artificial selection for high and low night sleep duration to an outbred population of Drosophila melanogaster for 13 generations. At the end of the selection procedure, night sleep duration diverged by 9.97 hours in the long and short sleeper populations, and 24-hour sleep was reduced to 3.3 hours in the short sleepers. Neither long nor short sleeper lifespan differed appreciably from controls, suggesting few physiological consequences to being an extreme long or short sleeper. Whole genome sequence data from seven generations of selection revealed several hundred thousand changes in allele frequencies at polymorphic loci across the genome. Combining the data from long and short sleeper populations across generations in a logistic regression implicated 126 polymorphisms in 80 candidate genes, and we confirmed three of these genes and a larger genomic region with mutant and chromosomal deficiency tests, respectively. Many of these genes could be connected in a single network based on previously known physical and genetic interactions. Candidate genes have known roles in several classic, highly conserved developmental and signaling pathways: EGFR, Wnt, Hippo, and MAPK. The involvement of highly pleiotropic pathway genes suggests that sleep duration in natural populations can be influenced by a wide variety of biological processes, which may be why the purpose of sleep has been so elusive.
How functional diversification affects the organization of the transcriptome is a central question in systems genetics. To explore this issue, we sequenced all six Odorant binding protein (Obp) genes located on the X chromosome, four of which occur as a cluster, in 219 inbred wild-derived lines of Drosophila melanogaster and tested for associations between genetic and phenotypic variation at the organismal and transcriptional level. We observed polymorphisms in Obp8a, Obp19a, Obp19b, and Obp19c associated with variation in olfactory responses and polymorphisms in Obp19d associated with variation in life span. We inferred the transcriptional context, or "niche," of each gene by identifying expression polymorphisms where genetic variation in these Obp genes was associated with variation in expression of transcripts genetically correlated to each Obp gene. Each of the six Obp genes occupied a distinct transcriptional niche. Gene ontology enrichment analysis revealed associations of different Obp transcriptional niches with olfactory behavior, synaptic transmission, detection of signals regulating tissue development and apoptosis, postmating behavior and oviposition, and nutrient sensing. Our results show that diversification of the Obp family has organized distinct transcriptional niches that reflect their acquisition of additional functions.