Although gene length is expected to be associated with factors such as intron number and evolutionary conservation, the connections between gene length and function in the human genome are not yet understood. In this study, we show that, as expected, there is a strong positive correlation between gene length, transcript length, and protein size, as well as a correlation with the number of genetic variants and introns. Among tissue-specific genes, we find that the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and brain, while the shortest transcripts tend to be expressed in the pancreas, skin, stomach, vagina, and testis. We report, as shown previously, that natural selection suppresses changes in genes with longer transcripts and promotes changes in genes with shorter transcripts. We also observe that genes with longer transcripts tend to have a higher number of co-expressed genes and protein-protein interactions, as well as more associated publications. In the functional analysis, we show that longer transcripts are often associated with neuronal development, while shorter transcripts tend to play roles in skin development and in the immune system. Furthermore, pathways related to cancer, neurons, and heart diseases tend to have genes with longer transcripts, whereas pathways related to immune responses and neurodegenerative diseases tend to have genes with shorter transcripts. Based on our results, we hypothesize that longer genes tend to be associated with functions that are important in early developmental stages, while shorter genes tend to play a role in functions that are important throughout life, such as the immune system, which requires fast responses.
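The correlation between gene length and transcript length reported above can be illustrated with a rank-based (Spearman) coefficient. This is only a minimal sketch: the abstract does not state which correlation measure was used, so Spearman is an assumption here, and the function names and the toy lengths are hypothetical, not data from the study.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(a), ranks(b))


# Hypothetical toy values (base pairs), for illustration only.
gene_length = [1200, 45000, 8000, 230000, 15000]
transcript_length = [900, 4200, 2100, 9800, 3900]
print(spearman(gene_length, transcript_length))
```

A rank-based measure is a common choice for length data because gene lengths span several orders of magnitude, which can let a few very long genes dominate a Pearson coefficient computed on raw values.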
Selective breeding of the domestic dog (Canis lupus familiaris) rigidly retains desirable features and could inadvertently fix disease-causing variants within a breed. We combine phenotypic data from >72,000 dogs with a large genotypic dataset to search for genes associated with cancer mortality and longevity in pedigree dog breeds. We validated previous findings that breeds with higher average body weight have higher cancer mortality rates and lower life expectancy. We identified a significant positive correlation between life span and cancer mortality residuals corrected for body weight, implying that long-lived breeds die more frequently from cancer than short-lived breeds. We replicated a number of known genetic associations with body weight (IGF1, GHR, CD36, SMAD2 and IGF2BP2). Subsequently, we identified five genetic variants in known cancer-related genes (located within SIPA1, ADCY7 and ARNT2) that could be associated with cancer mortality residuals corrected for confounding factors. One putative genetic variant showed a marginally significant association with longevity residuals that had been corrected for the effects of body weight; this variant is located within PRDX1, a peroxiredoxin that belongs to an emerging class of pro-longevity-associated genes. This research should be considered an exploratory analysis to uncover associations between genes and longevity/cancer mortality.
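The idea of "residuals corrected for body weight" can be sketched as follows: regress each trait on body weight, keep what the regression does not explain, and correlate the leftovers. This is one plausible reading of the abstract, not the authors' actual pipeline. Ordinary least squares is an assumption, and all function names and breed-level numbers below are hypothetical.

```python
from math import sqrt


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Part of y that a linear fit on x leaves unexplained."""
    slope, intercept = ols_fit(x, y)
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical breed-level values, for illustration only.
body_weight_kg = [5.0, 12.0, 25.0, 34.0, 50.0, 70.0]
lifespan_years = [14.5, 13.0, 12.5, 11.0, 9.5, 8.0]
cancer_mortality = [0.15, 0.22, 0.30, 0.27, 0.33, 0.31]

lifespan_resid = residuals(body_weight_kg, lifespan_years)
cancer_resid = residuals(body_weight_kg, cancer_mortality)
print(pearson(lifespan_resid, cancer_resid))
```

By construction the residuals are uncorrelated with body weight, so any correlation that remains between them cannot be attributed to a linear effect of body weight alone.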
Within primates, the great apes are outliers both in terms of body size and lifespan, since they include the largest and longest-lived species in the order. Yet, the molecular bases underlying such features are poorly understood. Here, we leveraged an integrated approach to investigate multiple sources of molecular variation across primates, focusing on over ten thousand genes, including ∼1,500 previously associated with lifespan, and an additional ∼9,000 for which an association with longevity has never been suggested. We analyzed dN/dS rates, positive selection, gene expression (RNA-seq) and gene regulation (ChIP-seq). By analyzing the correlation between dN/dS, maximum lifespan and body mass, we identified 276 genes whose rate of evolution positively correlates with maximum lifespan in primates. Further, we identified 5 genes, important for tumor suppression, adaptive immunity, metastasis and inflammation, under positive selection exclusively in the great ape lineage. RNA-seq data, generated from the liver of six species representing all the primate lineages, revealed that 8% of ∼1,500 genes previously associated with longevity are differentially expressed in apes relative to other primates. Importantly, by integrating RNA-seq with ChIP-seq for H3K27ac (which marks active enhancers), we show that the differentially expressed longevity genes are significantly more likely than expected to be located near a novel “ape-specific” enhancer. Moreover, these particular ape-specific enhancers are enriched for young transposable elements, and specifically SINE-VNTR-Alus (SVAs). In summary, we demonstrate that multiple evolutionary forces have contributed to the evolution of lifespan and body size in primates.
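A statement that a gene set is "significantly more likely than expected" to lie near a class of enhancers is commonly assessed with an over-representation test. The sketch below uses a one-sided hypergeometric tail probability. The abstract does not say which test the authors applied, so this choice is an assumption, and the function name and all counts are hypothetical placeholders rather than results from the study.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items, `successes` of which carry the feature."""
    total = comb(population, draws)
    upper = min(draws, successes)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only:
# 10,000 genes tested, 800 of them near an ape-specific enhancer,
# 120 differentially expressed longevity genes, 25 of which are near one.
p_value = hypergeom_upper_tail(
    population=10_000, successes=800, draws=120, observed=25
)
print(p_value)
```

Under these toy numbers about 9.6 of the 120 genes would be expected near such an enhancer by chance (120 × 800 / 10,000), so observing 25 would yield a small tail probability.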
Gene co-expression analysis has emerged as a powerful method to provide insights into gene function and regulation. The rapid growth of publicly available RNA-sequencing (RNA-seq) data has created opportunities for researchers to employ these abundant data to help decipher the complexity and biology of genomes. Co-expression networks have proven effective for inferring relationships between genes, for gene prioritization and for assigning function to poorly annotated genes based on their co-expressed partners. To facilitate such analyses, we previously created an online co-expression tool for humans and mice called GeneFriends. To continue providing a valuable tool to the scientific community, we have now updated the GeneFriends database and website. Here, we present the new version of GeneFriends, which includes gene and transcript co-expression networks based on RNA-seq data from 46 475 human and 34 322 mouse samples. The new database also encompasses tissue-specific gene co-expression networks for 20 human and 21 mouse tissues, dataset-specific gene co-expression maps based on the TCGA and GTEx projects, and gene co-expression networks for seven additional model organisms (fruit fly, zebrafish, worm, rat, yeast, cow and chicken). GeneFriends is freely available at http://www.genefriends.org/.
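The core idea of a co-expression analysis, ranking genes by how closely their expression tracks a query gene across samples, can be sketched in a few lines. The abstract does not specify the similarity measure or normalization used, so Pearson correlation on raw values is an assumption here. The gene names, expression values and the `coexpression_partners` helper are hypothetical and are not part of the GeneFriends interface.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def coexpression_partners(expression, gene, top_n=2):
    """Rank all other genes by their correlation with `gene` across samples."""
    scores = [
        (other, pearson(expression[gene], values))
        for other, values in expression.items()
        if other != gene
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical expression values: one list per gene, one entry per sample.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.1, 9.8],  # rises with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as GENE_A rises
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],  # only loosely related
}

for partner, score in coexpression_partners(expression, "GENE_A"):
    print(partner, round(score, 3))
```

A poorly annotated gene whose top-ranked partners share a known function is then a candidate for that function, which is the "guilt by association" reasoning that the abstract refers to.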