Numerous methods for classifying gene activity states based on gene expression data have been proposed for use in downstream applications, such as incorporating transcriptomics data into metabolic models in order to improve resulting flux predictions. These methods often attempt to classify gene activity for each gene in each experimental condition as belonging to one of two states: active (the gene product is part of an active cellular mechanism) or inactive (the cellular mechanism is not active). These existing methods of classifying gene activity states suffer from multiple limitations, including enforcing unrealistic constraints on the overall proportions of active and inactive genes, failing to leverage a priori knowledge of gene co-regulation, failing to account for differences between genes, and failing to provide statistically meaningful confidence estimates. We propose a flexible Bayesian approach to classifying gene activity states based on a Gaussian mixture model. The model integrates genome-wide transcriptomics data from multiple conditions and information about gene co-regulation to provide activity state confidence estimates for each gene in each condition. We compare the performance of our novel method to existing methods on both simulated data and real data from 907 E. coli gene expression arrays, and against experimentally measured flux values in 29 conditions, demonstrating that our method provides more consistent and accurate results than existing methods across a variety of metrics.
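The idea of an activity state confidence estimate from a Gaussian mixture can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single gene, two components with a shared standard deviation, and parameters that are already known, whereas the abstract describes a Bayesian model that also uses co-regulation information. The function names and all parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_active(x, mean_inactive, mean_active, sd, prior_active=0.5):
    """Posterior probability that expression value x came from the
    'active' component of a two-component Gaussian mixture.

    All parameters are assumed known here (an illustrative simplification).
    For values very far from both means the densities can underflow to 0.
    """
    weighted_active = prior_active * normal_pdf(x, mean_active, sd)
    weighted_inactive = (1.0 - prior_active) * normal_pdf(x, mean_inactive, sd)
    return weighted_active / (weighted_active + weighted_inactive)


# Hypothetical log-expression values for one gene across conditions.
for value in (4.0, 5.0, 6.0):
    print(value, posterior_active(value, mean_inactive=4.0, mean_active=6.0, sd=1.0))
```

A value halfway between the two component means gets a posterior of 0.5 when the prior is 0.5, so the output reads as a graded confidence, not a hard active/inactive call.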
Background: Health insurance plays a critical role in the accessibility to and quality of health care for patients with melanoma in the United States. Current knowledge regarding the association between insurance status and stage of melanoma is limited because few studies to date have simultaneously controlled for factors known to influence the risk of diagnosis of late‐stage melanoma. The current study was conducted to examine the association between health insurance status and stage of melanoma at the time of diagnosis in nonelderly adults, accounting for known risk factors for late‐stage diagnosis.
Methods: In this cross‐sectional study, the authors analyzed the National Cancer Data Base for cases of invasive melanoma diagnosed between 2004 and 2015 among individuals aged 26 to 64 years. Using the American Joint Committee on Cancer melanoma staging system, early‐stage melanoma was defined as stage I or stage II whereas late‐stage melanoma was defined as stage III or stage IV. Late‐stage diagnosis was the primary outcome compared across 4 insurance types (private, Medicaid, none, and unknown). Adjusted covariates were age, sex, race/ethnicity, educational level, income, year of diagnosis, number of comorbidities, and facility location. Logistic regression was used for univariable and multivariable analyses.
Results: Among 177,247 cases, individuals with Medicaid or no health insurance were found to have 3.12 (95% CI, 2.97‐3.28) and 2.21 (95% CI, 2.10‐2.33) times greater odds, respectively, of being diagnosed with late‐stage melanoma compared with individuals with private insurance after adjusting for risk factors in late‐stage diagnosis.
Conclusions: Future investigation into insurance disparities in the diagnosis of late‐stage melanoma may help to prioritize melanoma screening in populations with nonprivate insurance.
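For readers unfamiliar with how an odds ratio and its 95% confidence interval are read, here is a minimal sketch of the unadjusted calculation from a 2x2 table using the Wald interval on the log scale. This is not the study's analysis: the reported estimates come from multivariable logistic regression with covariate adjustment, and the counts below are hypothetical, chosen only to make the arithmetic visible.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    z=1.96 gives an approximate 95% interval. All four counts must be > 0.
    """
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1.0 / exposed_cases + 1.0 / exposed_noncases
        + 1.0 / unexposed_cases + 1.0 / unexposed_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: late-stage vs early-stage diagnoses in two groups.
estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
print(estimate, lower, upper)
```

An interval that excludes 1 indicates that the difference in odds between the two groups is unlikely to be due to sampling variation alone, which is how the intervals in the Results are interpreted.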
The new class of rare variant tests has usually been evaluated assuming perfect genotype information. In reality, rare variant genotypes may be incorrect, and so rare variant tests should be robust to imperfect data. Errors and uncertainty in SNP genotyping are already known to dramatically impact statistical power for single marker tests on common variants and, in some cases, inflate the type I error rate. Recent results show that uncertainty in genotype calls derived from sequencing reads is dependent on several factors, including read depth, calling algorithm, number of alleles present in the sample, and the frequency at which an allele segregates in the population. We have recently proposed a general framework for the evaluation and investigation of rare variant tests of association, classifying most rare variant tests into one of two broad categories (length or joint tests). We use this framework to relate factors affecting genotype uncertainty to the power and type I error rate of rare variant tests. We find that non-differential genotype errors (an error process that occurs independent of phenotype) decrease power, with larger decreases for extremely rare variants and for the common homozygote to heterozygote error. Differential genotype errors (an error process that is associated with phenotype status) lead to inflated type I error rates, which are more likely to occur at sites with more common homozygote to heterozygote errors than vice versa. Finally, our work suggests that certain rare variant tests and study designs may be more robust to the inclusion of genotype errors. Further work is needed to directly integrate genotype calling algorithm decisions, study costs, and test statistic choices to provide comprehensive design and analysis advice that appropriately accounts for the impact of genotype errors.
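The non-differential error process described above can be sketched as a small simulation helper. This is an illustrative assumption, not the authors' framework: genotypes are coded 0 (common homozygote), 1 (heterozygote), and 2 (rare homozygote), only the two error directions named in the abstract are modeled, and the function name and error rates are hypothetical. Because the same rates apply to every individual regardless of phenotype, the errors are non-differential; a differential process would use different rates for cases and controls.

```python
import random


def apply_genotype_errors(genotypes, p_hom_to_het, p_het_to_hom, rng):
    """Return observed genotypes after applying miscalling errors.

    0 -> 1 with probability p_hom_to_het (common homozygote called het).
    1 -> 0 with probability p_het_to_hom (het called common homozygote).
    Genotype 2 is left unchanged (a simplification for this sketch).
    """
    observed = []
    for genotype in genotypes:
        draw = rng.random()  # uniform on [0, 1)
        if genotype == 0 and draw < p_hom_to_het:
            observed.append(1)
        elif genotype == 1 and draw < p_het_to_hom:
            observed.append(0)
        else:
            observed.append(genotype)
    return observed


# Hypothetical rare-variant site: mostly common homozygotes, two carriers.
true_genotypes = [0] * 8 + [1] * 2
observed_genotypes = apply_genotype_errors(
    true_genotypes, p_hom_to_het=0.01, p_het_to_hom=0.05, rng=random.Random(0)
)
print(observed_genotypes)
```

At a rare site almost every true genotype is the common homozygote, so even a small common homozygote to heterozygote error rate can add false carriers in numbers comparable to the true carriers, which is consistent with the larger power loss the abstract reports for that error direction.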