Alternative polyadenylation (APA) is an essential RNA processing event that contributes significantly to the regulation of transcriptome diversity and functional dynamics in both animals and plants. Here we review newly developed next-generation sequencing methods for genome-wide profiling of APA sites, bioinformatics pipelines for data processing, and both wet- and dry-laboratory approaches for APA validation. The library construction methods LITE-Seq (Low-Input 3'-Terminal sequencing) and PAC-seq (PolyA Click sequencing) tag polyA+ cDNA, while BAT-seq (BArcoded, three-prime specific sequencing) and PAPERCLIP (Poly(A) binding Protein-mediated mRNA 3′End Retrieval by CrossLinking ImmunoPrecipitation) enrich polyA+ RNA. Only WTTS-seq (Whole Transcriptome Termini Site sequencing) targets both polyA+ RNA and polyA+ cDNA. A variety of well-established bioinformatics pipelines support read quality control, mapping, clustering, characterization and pathway analysis. The RHAPA (RNase H alternative polyadenylation assay) and 3'RACE-seq (3' rapid amplification of cDNA end sequencing) methods directly validate APA sites, while WTSS-seq (whole transcriptome start site sequencing), RNA-seq (RNA sequencing) and public APA databases can serve as indirect validation methods. We hope that these tools, pipelines and resources will stimulate broad interest in the research community in investigating APA events underlying physiological, pathological and psychological changes, and thus in understanding the information transfer events from genome to phenome relevant to economically important traits in both animals and plants.
Whether DNA variation changes genome-wide nucleotide composition remains largely unknown. By examining 4,604,291 DNA variants between two rat strains, we observed that sequencing depth is strongly correlated with genome content: on average, 21.41, 38.36, 44.26 and 6512.70 reads per locus were collected for the Y, X, autosomal and mitochondrial (MT) genomes, respectively (P<0.0001). The mutation rates corresponding to these four genome subsets were 0.055, 0.401, 1.733 and 4.475 variants per kb (P<0.0001), confirming the links between recombination frequencies and DNA variability. Although SNPs (single nucleotide polymorphisms) tend to reduce AT content, the excess of CG deletions over CG insertions (INDELs) implies that GC content would not increase. Therefore, the SNP-INDEL interplay may play a key role in the maintenance of AT-rich genomes in rat during evolution. Formation of CpG sites appears to be hindered because genome-wide G INDELs with C as the 5’-nucleotide (1.38%) and CG INDELs (1.19%) are rare. However, the relatively high C→G/G→C rate in 5’UTRs (untranslated regions) and the G/C INDELs in the 5’UTR and/or exonic regions highlight their importance for the execution of gene function. Our study provides evidence that DNA variation does not jeopardize genome stability and functional conservation during evolution.
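The two quantities this abstract relies on, variant density per kb and the direction in which a variant shifts GC content, can be illustrated with a minimal Python sketch. The function names and the example inputs are hypothetical and are not taken from the study; the sketch only shows the arithmetic, under the assumption that a variant's effect is counted as the net number of G/C bases gained or lost.

```python
# Illustrative sketch only: helper names and inputs are hypothetical,
# not the authors' pipeline or data.

GC_BASES = {"G", "C"}


def variants_per_kb(n_variants, region_length_bp):
    """Variant density: number of variants per 1,000 bp of sequence."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return n_variants / (region_length_bp / 1000)


def snp_gc_effect(ref, alt):
    """Net change in G/C count caused by a SNP.

    +1: A/T -> G/C (reduces AT content)
    -1: G/C -> A/T (reduces GC content)
     0: composition unchanged (e.g. C -> G or A -> T)
    """
    ref, alt = ref.upper(), alt.upper()
    return int(alt in GC_BASES) - int(ref in GC_BASES)


def indel_gc_effect(sequence, is_insertion):
    """Net change in G/C count caused by inserting or deleting `sequence`."""
    gc_count = sum(base in GC_BASES for base in sequence.upper())
    return gc_count if is_insertion else -gc_count


if __name__ == "__main__":
    # Hypothetical region: 4,475 variants in 1 Mb of sequence.
    print(variants_per_kb(4475, 1_000_000))
    # An A -> G SNP gains one G/C base; a CG deletion loses two.
    print(snp_gc_effect("A", "G") + indel_gc_effect("CG", is_insertion=False))
```

Summing these per-variant effects over SNPs and INDELs separately is one simple way to see how a GC gain from SNPs could be offset by a GC loss from deletions, which is the interplay the abstract describes.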