Long nanopore reads are advantageous in de novo genome assembly. However, nanopore reads usually have a broad error distribution and contain high-error-rate subsequences. Existing error correction tools cannot correct nanopore reads efficiently and effectively. Most methods trim high-error-rate subsequences during error correction, which reduces both the length of the reads and the contiguity of the final assembly. Here, we develop an error correction and de novo assembly tool designed to overcome complex errors in nanopore reads. We propose an adaptive read selection and two-step progressive method to quickly correct nanopore reads to high accuracy. We introduce a two-stage assembler to utilize the full length of nanopore reads. Our tool achieves superior performance in both error correction and de novo assembly of nanopore reads. It requires only 8122 hours to assemble a 35X coverage human genome and achieves a 2.47-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line shows an NG50 of 22 Mbp. The high-quality assembly of nanopore reads can significantly reduce false positives in structural variation detection.
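The NG50 values quoted above measure assembly contiguity: NG50 is the length of the contig at which the cumulative length of contigs, sorted longest first, reaches half of the estimated genome size. A minimal Python sketch of that calculation follows; the function name `ng50` and the toy contig lengths are assumptions made for illustration and are not taken from the tool described in the abstract.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    Contigs are visited from longest to shortest; the NG50 is the length
    of the contig at which the running total first reaches half of the
    (estimated) genome size. Returns 0 if the assembly covers less than
    half of the genome.
    """
    half = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return 0


if __name__ == "__main__":
    # Toy contig lengths, chosen only to illustrate the calculation.
    contigs = [50, 40, 30, 20, 10]
    print(ng50(contigs, genome_size=200))  # 50 + 40 + 30 = 120 >= 100, so 30
```

Unlike N50, which uses half of the total assembly length as the threshold, NG50 uses half of the genome size, so it stays comparable across assemblies of the same genome that differ in total length.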
The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences, and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences in a database and to discriminate functionally related regulatory sequences from unrelated sequences. The second aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into a Markov model, are more efficient.
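The abstract does not give the form of its measures, so the following Python sketch only illustrates the general idea of combining k-word distributions with a Markov model. It assumes an order-0 model (independent letters) for the expected k-word frequencies and compares two sequences by the cosine of their centered k-word vectors. The function names, the choice of model order, and the cosine comparison are all assumptions for illustration, not the measures proposed in the paper.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kword_freqs(seq, k):
    """Observed relative frequency of each overlapping k-word in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def expected_freqs(seq, k, alphabet="ACGT"):
    """Expected k-word frequencies under an order-0 Markov model,
    i.e. the product of the single-letter frequencies of seq."""
    base = Counter(seq)
    probs = {a: base[a] / len(seq) for a in alphabet}
    expected = {}
    for word in product(alphabet, repeat=k):
        p = 1.0
        for letter in word:
            p *= probs[letter]
        expected["".join(word)] = p
    return expected


def centered_similarity(s1, s2, k=2, alphabet="ACGT"):
    """Cosine similarity of (observed - expected) k-word vectors."""
    f1, e1 = kword_freqs(s1, k), expected_freqs(s1, k, alphabet)
    f2, e2 = kword_freqs(s2, k), expected_freqs(s2, k, alphabet)
    words = list(e1)
    x = [f1.get(w, 0.0) - e1[w] for w in words]
    y = [f2.get(w, 0.0) - e2[w] for w in words]
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy sequences, chosen only to exercise the functions.
    print(centered_similarity("ACGTACGGTCA", "ACGTTCGGACA", k=2))
```

Subtracting the model expectation removes the part of the k-word signal that is explained by base composition alone, which is the usual motivation for bringing a Markov model into an alignment-free comparison.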
On the basis of a selected pair of physicochemical properties of amino acids, we introduce a dynamic 2D graphical representation of protein sequences. Then, we introduce and compare two numerical characterizations of protein graphs as descriptors to analyze the nine ND5 proteins. The approach is simple, convenient, and fast.
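As a rough illustration of a 2D graphical representation, the sketch below maps each residue to a step vector given by a pair of property values and accumulates the steps into a curve. The abstract does not say which two physicochemical properties are used or how the curve is built, so the cumulative-walk construction, the function name `graph_points`, and the property values in the demo are all hypothetical.

```python
def graph_points(sequence, properties):
    """Return the 2D curve of a sequence as a list of (x, y) points.

    `properties` maps each residue letter to a pair of numbers (one per
    property). The curve starts at the origin and each residue moves the
    current point by its property pair.
    """
    x = y = 0.0
    points = [(x, y)]
    for residue in sequence:
        dx, dy = properties[residue]
        x += dx
        y += dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Made-up property pairs for two residues, for illustration only;
    # these are NOT real physicochemical values.
    toy_properties = {"A": (1.0, 0.5), "G": (-0.5, 1.0)}
    print(graph_points("AGA", toy_properties))
```

Numerical descriptors of such a curve (for example, statistics of its point coordinates) can then be compared between proteins in place of the raw sequences.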
Background: Our recent study showed that the global physiological function of the differentially expressed genes of prostate cancer in Chinese patients differs from that in non-Chinese populations. MicroRNAs (miRNAs) are estimated to regulate the expression of more than 60% of all protein-coding genes. To further investigate the global association between the transcript abundance of miRNAs and their target mRNAs in Chinese patients, we used a microRNA microarray approach combined with bioinformatics and clinical-pathological assays to investigate the miRNA profile and evaluate the potential of miRNAs as diagnostic and prognostic markers in Chinese patients. Results: A total of 28 miRNAs (fold change ≥1.5; P ≤ 0.05) were differentially expressed between tumor tissue and adjacent benign tissue of 4 prostate cancer patients. The top 10 differentially expressed miRNAs were validated by qRT-PCR using all 20 tissue pairs. Compared with the miRNA profiles of non-Chinese populations, the current study showed that miR-23b, miR-220, miR-221, miR-222, and miR-205 may be common critical therapeutic targets in different populations. The integrated analysis of the mRNA and miRNA microarrays showed the effects of specifically inhibiting and/or enhancing the function of miRNAs at the gene transcription level. The current study also identified 15 miRNAs specifically expressed in Chinese patients. Statistical analysis of clinical features revealed that miR-374b and miR-19a correlate significantly with clinical-pathological features in Chinese patients. Conclusions: Our findings showed that Chinese prostate cancer patients have both common and specific miRNA expression profiles compared with non-Chinese populations. miR-374b is down-regulated in prostate cancer tissue and can be identified as an independent predictor of biochemical recurrence-free survival.