Measures of RNA abundance are important for many areas of biology and are often obtained from high-throughput RNA sequencing methods such as Illumina sequencing. These measures need to be normalized to remove technical biases inherent in the sequencing approach, most notably the length of the RNA species and the sequencing depth of a sample. These biases are corrected in the widely used reads per kilobase per million reads (RPKM) measure. Here, we argue that the intended meaning of RPKM is a measure of relative molar RNA concentration (rmc) and show that for each set of transcripts the average rmc is a constant, namely the inverse of the number of transcripts mapped. Further, we show that RPKM does not respect this invariance property and thus cannot be an accurate measure of rmc. We propose a slight modification of RPKM that eliminates this inconsistency and call it TPM, for transcripts per million. TPM respects the average invariance and eliminates statistical biases inherent in the RPKM measure.
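The difference between the two measures can be illustrated with a minimal sketch. It assumes the commonly used definitions (RPKM divides a transcript's read count by its length in kilobases and by the total mapped reads in millions; TPM divides each length-normalized count by the sum of all length-normalized counts and scales to one million). The function names, transcript lengths, and read counts below are made up for illustration and are not taken from the paper.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads (common definition)."""
    total_reads = sum(counts)
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return [c * 1e9 / (length * total_reads)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]


# Hypothetical data: three transcripts, two samples with equal read depth.
lengths_bp = [1000, 2000, 500]
sample_a = [100, 200, 50]
sample_b = [300, 40, 10]

for name, counts in (("A", sample_a), ("B", sample_b)):
    r = rpkm(counts, lengths_bp)
    t = tpm(counts, lengths_bp)
    # Mean TPM is always 1e6 / (number of transcripts);
    # mean RPKM depends on how reads are spread over transcript lengths.
    print(name, "mean RPKM:", sum(r) / len(r), "mean TPM:", sum(t) / len(t))
```

In this toy case both samples have the same total read count, yet the mean RPKM differs between them, while the mean TPM is the same in both because the TPM values of a sample always sum to one million.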
A network of interactions is called modular if it is subdivided into relatively autonomous, internally highly connected components. Modularity has emerged as a rallying point for research in developmental and evolutionary biology (and specifically evo-devo), as well as in molecular systems biology. Here we review the evidence for modularity and models about its origin. Although there is an emerging agreement that organisms have a modular organization, the main open problem is the question of whether modules arise through the action of natural selection or because of biased mutational mechanisms.
It was first noticed 100 years ago that mutations tend to affect more than one phenotypic characteristic, a phenomenon that was called 'pleiotropy'. Because pleiotropy was found so frequently, the notion arose that pleiotropy is 'universal'. However, quantitative estimates of pleiotropy have not been available until recently. These estimates show that pleiotropy is highly restricted and are more in line with the notion of variational modularity than with universal pleiotropy. This finding has major implications for the evolvability of complex organisms and the mapping of disease-causing mutations.
Alternative inclusion of exons increases the functional diversity of proteins. Among alternatively spliced exons, tissue-specific exons play a critical role in maintaining tissue identity. This raises the question of how tissue-specific protein-coding exons influence protein function. Here we investigate the structural, functional, interaction, and evolutionary properties of constitutive, tissue-specific, and other alternative exons in human. We find that tissue-specific protein segments often contain disordered regions, are enriched in posttranslational modification sites, and frequently embed conserved binding motifs. Furthermore, genes containing tissue-specific exons tend to occupy central positions in interaction networks, display distinct interaction partners in the respective tissues, and are enriched in signaling, development, and disease genes. Based on these findings, we propose that tissue-specific inclusion of disordered segments that contain binding motifs rewires interaction networks and signaling pathways. In this way, tissue-specific splicing may contribute to the functional versatility of proteins and increase the diversity of interaction networks across tissues.
A fundamental challenge in biology is explaining the origin of novel phenotypic characters such as new cell types; the molecular mechanisms that give rise to novelties are unclear. We explored the gene regulatory landscape of mammalian endometrial cells using comparative RNA-Seq and found that 1,532 genes were recruited into endometrial expression in placental mammals, indicating that the evolution of pregnancy was associated with a large-scale rewiring of the gene regulatory network. About 13% of recruited genes are within 200 kb of a Eutherian-specific transposable element (MER20). These transposons have the epigenetic signatures of enhancers, insulators and repressors, directly bind transcription factors essential for pregnancy and coordinately regulate gene expression in response to progesterone and cAMP. We conclude that the transposable element, MER20, contributed to the origin of a novel gene regulatory network dedicated to pregnancy in placental mammals, particularly by recruiting the cAMP signaling pathway into endometrial stromal cells.