Lung cancer is the leading cause of death from cancer in the US and the world. The high mortality rate (80-85% within 5 years) results, in part, from a lack of effective tools to diagnose the disease at an early stage. Given that cigarette smoke creates a field of injury throughout the airway, we sought to determine if gene expression in histologically normal large-airway epithelial cells obtained at bronchoscopy from smokers with suspicion of lung cancer could be used as a lung cancer biomarker. Using a training set (n = 77) and gene-expression profiles from Affymetrix HG-U133A microarrays, we identified an 80-gene biomarker that distinguishes smokers with and without lung cancer. We tested the biomarker on an independent test set (n = 52), with an accuracy of 83% (80% sensitive, 84% specific), and on an additional validation set independently obtained from five medical centers (n = 35). Our biomarker had approximately 90% sensitivity for stage 1 cancer across all subjects. Combining cytopathology of lower airway cells obtained at bronchoscopy with the biomarker yielded 95% sensitivity and a 95% negative predictive value. These findings indicate that gene expression in cytologically normal large-airway epithelial cells can serve as a lung cancer biomarker, potentially owing to a cancer-specific airway-wide response to cigarette smoke.
Fetal hemoglobin (HbF) is the major genetic modulator of the hematologic and clinical features of sickle cell disease, an effect mediated by its exclusion from the sickle hemoglobin polymer. Fetal hemoglobin genes are genetically regulated, and the level of HbF and its distribution among sickle erythrocytes is highly variable. Some patients with sickle cell disease have exceptionally high levels of HbF that are associated with the Senegal and Saudi-Indian haplotype of the HBB-like gene cluster; some patients with different haplotypes can have similarly high HbF. In these patients, high HbF is associated with generally milder but not asymptomatic disease. Studying these persons might provide additional insights into HbF gene regulation. HbF appears to benefit some complications of disease more than others. This might be related to the premature destruction of erythrocytes that do not contain HbF, even though the total HbF concentration is high. Recent insights into HbF regulation have spurred new efforts to induce high HbF levels in sickle cell disease beyond those achievable with the current limited repertory of HbF inducers.
Like most complex phenotypes, exceptional longevity is thought to reflect a combined influence of environmental (e.g., lifestyle choices, where we live) and genetic factors. To explore the genetic contribution, we undertook a genome-wide association study of exceptional longevity in 801 centenarians (median age at death 104 years) and 914 genetically matched healthy controls. Using these data, we built a genetic model that includes 281 single nucleotide polymorphisms (SNPs) and discriminated between cases and controls of the discovery set with 89% sensitivity and specificity, and with 58% specificity and 60% sensitivity in an independent cohort of 341 controls and 253 genetically matched nonagenarians and centenarians (median age 100 years). Consistent with the hypothesis that the genetic contribution is largest with the oldest ages, the sensitivity of the model increased in the independent cohort with older and older ages (71% to classify subjects with an age at death>102 and 85% to classify subjects with an age at death>105). For further validation, we applied the model to an additional, unmatched 60 centenarians (median age 107 years) resulting in 78% sensitivity, and 2863 unmatched controls with 61% specificity. The 281 SNPs include the SNP rs2075650 in TOMM40/APOE that reached irrefutable genome wide significance (posterior probability of association = 1) and replicated in the independent cohort. Removal of this SNP from the model reduced the accuracy by only 1%. Further in-silico analysis suggests that 90% of centenarians can be grouped into clusters characterized by different “genetic signatures” of varying predictive values for exceptional longevity. The correlation between 3 signatures and 3 different life spans was replicated in the combined replication sets. The different signatures may help dissect this complex phenotype into sub-phenotypes of exceptional longevity.
Human longevity is heritable, but genome-wide association (GWA) studies have had limited success. Here, we perform two meta-analyses of GWA studies of a rigorous longevity phenotype definition including 11,262/3484 cases surviving at or beyond the age corresponding to the 90th/99th survival percentile, respectively, and 25,483 controls whose age at death or at last contact was at or below the age corresponding to the 60th survival percentile. Consistent with previous reports, rs429358 (apolipoprotein E (ApoE) ε4) is associated with lower odds of surviving to the 90th and 99th percentile age, while rs7412 (ApoE ε2) shows the opposite. Moreover, rs7676745, located near GPR78 , associates with lower odds of surviving to the 90th percentile age. Gene-level association analysis reveals a role for tissue-specific expression of multiple genes in longevity. Finally, genetic correlation of the longevity GWA results with that of several disease-related phenotypes points to a shared genetic architecture between health and longevity.
This article presents a Bayesian method for model-based clustering of gene expression dynamics. The method represents geneexpression dynamics as autoregressive equations and uses an agglomerative procedure to search for the most probable set of clusters given the available data. The main contributions of this approach are the ability to take into account the dynamic nature of gene expression time series during clustering and a principled way to identify the number of distinct clusters. As the number of possible clustering models grows exponentially with the number of observed time series, we have devised a distance-based heuristic search procedure able to render the search process feasible. In this way, the method retains the important visualization capability of traditional distance-based clustering and acquires an independent, principled measure to decide when two series are different enough to belong to different clusters. The reliance of this method on an explicit statistical representation of gene expression dynamics makes it possible to use standard statistical techniques to assess the goodness of fit of the resulting model and validate the underlying assumptions. A set of gene-expression time series, collected to study the response of human fibroblasts to serum, is used to identify the properties of the method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.