Regulatory variation in gene expression can be described by cis- and trans-genetic components. Here we used RNA-seq data from a population panel of Drosophila melanogaster test crosses to compare allelic imbalance (AI) in female head tissue between mated and virgin flies, an environmental change known to affect transcription. Indeed, 3048 exons (1610 genes) are differentially expressed in this study. A Bayesian model for AI, with an intersection test, controls type I error. There are 200 genes with AI exclusively in mated or virgin flies, indicating an environmental component of expression regulation. On average 34% of genes within a cross and 54% of all genes show evidence for genetic regulation of transcription. Nearly all differentially regulated genes are affected in cis, with an average of 63% of expression variation explained by the cis-effects. Trans-effects explain 8% of the variance in AI on average and the interaction between cis and trans explains an average of 11% of the total variance in AI. In both environments cis- and trans-effects are compensatory in their overall effect, with a negative association between cis- and trans-effects in 85% of the exons examined. We hypothesize that the gene expression level perturbed by cis-regulatory mutations is compensated through trans-regulatory mechanisms, e.g., trans and cis-by-trans factors buffering cis-mutations. In addition, when AI is detected in both environments, cis-mated, cis-virgin, and trans-mated-trans-virgin estimates are highly concordant with 99% of all exons positively correlated with a median correlation of 0.83 for cis and 0.95 for trans. We conclude that the gene regulatory networks (GRNs) are robust and that trans-buffering explains robustness.
Background: One method of identifying cis regulatory differences is to analyze allele-specific expression (ASE) and identify cases of allelic imbalance (AI). RNA-seq is the most common way to measure ASE and a binomial test is often applied to determine statistical significance of AI. This implicitly assumes that there is no bias in estimation of AI. However, bias has been found to result from multiple factors including: genome ambiguity, reference quality, the mapping algorithm, and biases in the sequencing process. Two alternative approaches have been developed to handle bias: adjusting for bias using a statistical model and filtering regions of the genome suspected of harboring bias. Existing statistical models which account for bias rely on information from DNA controls, which can be cost-prohibitive for large intraspecific studies. In contrast, data filtering is inexpensive and straightforward, but necessarily involves sacrificing a portion of the data. Results: Here we propose a flexible Bayesian model for analysis of AI, which accounts for bias and can be implemented without DNA controls. In lieu of DNA controls, this Poisson-Gamma (PG) model uses an estimate of bias from simulations. The proposed model always has a lower type I error rate compared to the binomial test. Consistent with prior studies, bias dramatically affects the type I error rate. All of the tested models are sensitive to misspecification of bias. The closer the estimate of bias is to the true underlying bias, the lower the type I error rate. Correct estimates of bias result in a level alpha test. Conclusions: To improve the assessment of AI, some forms of systematic error (e.g., map bias) can be identified using simulation. The resulting estimates of bias can be used to correct for bias in the PG model, without data filtering. Other sources of bias (e.g., unidentified variant calls) can be easily captured by DNA controls, but are missed by common filtering approaches. Consequently, as variant identification improves, the need for DNA controls will be reduced. Filtering does not significantly improve performance and is not recommended, as information is sacrificed without a measurable gain. The PG model developed here performs well when bias is known, or slightly misspecified. The model is flexible and can accommodate differences in experimental design and bias estimation.
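The point about the binomial test and bias can be illustrated with a minimal sketch. This is not the Poisson-Gamma model from the abstract; it is only an exact two-sided binomial test in which the null reference-allele fraction can be moved away from 0.5. The function names and the `null_ref_fraction` parameter are hypothetical, and the assumption is that a bias estimate (for example, from simulation) is expressed as the expected fraction of reads assigned to the reference allele when there is no true AI.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def ai_binomial_test(ref_count, alt_count, null_ref_fraction=0.5):
    """Exact two-sided binomial test for allelic imbalance.

    null_ref_fraction is a hypothetical parameter: the fraction of reads
    expected to map to the reference allele under no true imbalance.
    The default 0.5 is the usual "no bias" assumption.
    """
    n = ref_count + alt_count
    if n == 0:
        raise ValueError("no allele-specific reads")
    observed = binomial_pmf(ref_count, n, null_ref_fraction)
    # Sum the probability of every outcome that is no more likely than
    # the observed one; the small tolerance absorbs floating-point ties.
    cutoff = observed * (1 + 1e-7)
    p_value = 0.0
    for k in range(n + 1):
        prob = binomial_pmf(k, n, null_ref_fraction)
        if prob <= cutoff:
            p_value += prob
    return min(1.0, p_value)


# 60 reference reads vs 40 alternate reads, tested two ways.
p_unbiased_null = ai_binomial_test(60, 40)                       # null fraction 0.5
p_biased_null = ai_binomial_test(60, 40, null_ref_fraction=0.6)  # assumed map bias
```

Under the assumed bias of 0.6, the same 60/40 split is exactly what the null predicts, so the evidence for AI disappears. This is the sense in which a test that ignores bias can inflate the type I error rate.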
OBJECTIVE: Gut microbiome dysbiosis is associated with numerous diseases, including type 1 diabetes. This pilot study determines how geographical location affects the microbiome of infants at high risk for type 1 diabetes in a population of homogenous HLA class II genotypes. RESEARCH DESIGN AND METHODS: High-throughput 16S rRNA sequencing was performed on stool samples collected from 90 high-risk, nonautoimmune infants participating in The Environmental Determinants of Diabetes in the Young (TEDDY) study in the U.S., Germany, Sweden, and Finland. RESULTS: Study site–specific patterns of gut colonization share characteristics across continents. Finland and Colorado have a significantly lower bacterial diversity, while Sweden and Washington state are dominated by Bifidobacterium in early life. Bacterial community diversity over time is significantly different by geographical location. CONCLUSIONS: The microbiome of high-risk infants is associated with geographical location. Future studies aiming to identify the microbiome disease phenotype need to carefully consider the geographical origin of subjects.
Previous research has examined time-varying relations between smoking and negative affect, urge to smoke, smoking dependence, and certain smoking cessation therapies. We extend this work using ILD of unexplored variables in a socioeconomically disadvantaged sample of smokers seeking cessation treatment. These findings could be used to inform ecological momentary interventions that deliver treatment resources (eg, video- or text-based content) to individuals based upon critical variables surrounding their attempt.
We describe a new variable selection procedure for categorical responses where the candidate models are all probit regression models. The procedure uses objective intrinsic priors for the model parameters, which do not depend on tuning parameters, and ranks the models for the different subsets of covariates according to their model posterior probabilities. When the number of covariates is moderate or large, the number of potential models can be very large, and for those cases, we derive a new stochastic search algorithm that explores the potential sets of models driven by their model posterior probabilities. The algorithm allows the user to control the dimension of the candidate models and thus can handle situations when the number of covariates exceeds the number of observations. We assess, through simulations, the performance of the procedure and apply the variable selector to a gene expression data set, where the response is whether a patient exhibits pneumonia. Software needed to run the procedures is available in the R package varselectIP.
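The general idea of a stochastic search over covariate subsets with a cap on model dimension can be sketched as follows. This is not the algorithm from the abstract and not the varselectIP package; it is a generic Metropolis-style search, assuming the caller supplies a function that returns a log score for any subset, up to an additive constant. The names `stochastic_search` and `toy_log_score` are hypothetical, and the toy score stands in for a real model posterior probability, which would be computed under intrinsic priors.

```python
import math
import random


def toy_log_score(model):
    """Hypothetical stand-in for a log posterior model probability.

    Rewards covariates 0 and 2 (treated here as the "true" ones) and
    penalizes every other covariate in the model.
    """
    relevant = {0, 2}
    return 3.0 * len(model & relevant) - 1.0 * len(model - relevant)


def stochastic_search(n_covariates, log_score, max_size, n_iter=2000, seed=0):
    """Metropolis-style search over subsets of covariate indices.

    Each step proposes adding or dropping one randomly chosen covariate.
    Proposals with more than max_size covariates are rejected, which
    keeps the model dimension under the user's control.
    Returns (model, visit_count) pairs, most visited first.
    """
    rng = random.Random(seed)
    current = frozenset()
    current_score = log_score(current)
    visits = {}
    for _ in range(n_iter):
        j = rng.randrange(n_covariates)
        proposal = current ^ {j}  # flip membership of covariate j
        if len(proposal) <= max_size:
            proposal_score = log_score(proposal)
            log_ratio = proposal_score - current_score
            # Accept uphill moves always, downhill moves with
            # probability exp(log_ratio).
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                current, current_score = proposal, proposal_score
        visits[current] = visits.get(current, 0) + 1
    return sorted(visits.items(), key=lambda item: item[1], reverse=True)


ranked = stochastic_search(n_covariates=6, log_score=toy_log_score, max_size=3)
best_model, best_visits = ranked[0]
```

Visit frequencies approximate the relative scores of the models, so the most visited subsets are the candidates to report. Because the flip proposal is symmetric, no proposal correction is needed in the acceptance step.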