Background: De novo genome assembly from next-generation sequencing (NGS) short reads is increasing rapidly; however, several major challenges must still be overcome for it to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published genomes, but it still needs improvement in continuity, accuracy and coverage, especially in repeat regions. Findings: To overcome these challenges, we have developed its successor, SOAPdenovo2, which has a new algorithm design that reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closing, and is optimized for large genomes. Conclusions: Benchmarks using the Assemblathon1 and GAGE datasets showed that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo and is competitive with other assemblers in both assembly length and accuracy. We also provide an updated assembly of the 2008 Asian (YH) genome using SOAPdenovo2. Here, the contig and scaffold N50 of the YH genome were ~20.9 kbp and ~22 Mbp, respectively, which are 3-fold and 50-fold longer than in the first published version. The genome coverage increased from 81.16% to 93.91%, and memory consumption was ~2/3 lower at the point of peak memory usage.
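The contig and scaffold N50 values quoted above follow the usual definition of the metric: the length L such that sequences of length L or longer together make up at least half of the total assembly length. The function below is a minimal sketch of that definition only. The name `n50` and the example lengths are illustrative assumptions and are not taken from SOAPdenovo2 or the YH assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (hypothetical name, not part of SOAPdenovo2).
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence down, accumulating total length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # assembly is the N50.
        if running >= half_total:
            return length


# Toy lengths, made up for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

Because N50 depends only on the distribution of lengths, a longer N50 indicates better contiguity but says nothing by itself about assembly correctness, which is why the abstract reports accuracy and coverage alongside it.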
MicroRNAs (miRNAs) are small non-coding RNAs that function as negative regulators of gene expression. Emerging evidence shows that, in addition to functioning in the cytoplasm, miRNAs are also present in the nucleus. However, the functional significance of nuclear miRNAs remains largely undetermined. By screening miRNA databases, we identified a subset of miRNAs that function as enhancer regulators, and we found that a set of these miRNAs show gene-activation function. We focused on miR-24-1 and found that this miRNA unconventionally activates gene transcription by targeting enhancers. Consistently, the activation was completely abolished when the enhancer sequence was deleted by TALEN. Furthermore, we found that miR-24-1 activates enhancer RNA (eRNA) expression, alters histone modification, and increases the enrichment of p300 and RNA Pol II at the enhancer locus. Our results demonstrate a novel mechanism of miRNA as an enhancer trigger.
We have developed a new method, SOAPfuse, to identify fusion transcripts from paired-end RNA-Seq data. SOAPfuse applies an improved partial exhaustion algorithm to construct a library of fusion junction sequences, which can be used to efficiently identify fusion events, and employs a series of filters to nominate high-confidence fusion transcripts. Compared with other released tools, SOAPfuse achieves higher detection efficiency and consumes fewer computing resources. We applied SOAPfuse to RNA-Seq data from two bladder cancer cell lines and confirmed 15 fusion transcripts, including several novel events common to both cell lines. SOAPfuse is available at http://soap.genomics.org.cn/soapfuse.html.
The blind mole rat (BMR), Spalax galili, is an excellent model for studying mammalian adaptation to life underground and for medical applications. The BMR spends its entire life underground, which protects it from predators and climatic fluctuations while exposing it to multiple stressors such as darkness, hypoxia, hypercapnia, energetics and high pathogenicity. Here we sequence and analyse the BMR genome and transcriptome, highlighting the possible genomic adaptive responses to the underground stressors. Our results show high rates of RNA/DNA editing, reduced chromosome rearrangements, an over-representation of short interspersed elements (SINEs) probably linked to hypoxia tolerance, degeneration of vision and progression of photoperiodic perception, tolerance to hypercapnia and hypoxia, and resistance to cancer. The remarkable traits of the BMR, together with its genomic and transcriptomic information, enhance our understanding of adaptation to extreme environments and will enable the utilization of BMR models for biomedical research in the fight against cancer, stroke and cardiovascular diseases.