Background: Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, a comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform is still lacking. Unlike reads from Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. Combining sequencing data from different platforms can also yield reads of varying length. Whether commonly used software handles such data satisfactorily is unknown. Results: Using universal human reference RNA as the starting material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of expression level estimates. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27 % to genome, 86.46 % to transcriptome), though 47.83 % of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With reference annotation as a guide, the mapping rate of TopHat2 increased significantly from 75.79 % to 92.09 %, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. Conclusion: We provide, for the first time, reference statistics on library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq on the Ion Proton platform. Chemical fragmentation performed as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2745-8) contains supplementary material, which is available to authorized users.
Y-chromosomal microdeletion (YCM) is an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control groups, respectively, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome.
Although a few studies have reported the effects of several polymorphisms on major adverse cardiovascular events (MACE) in patients with acute coronary syndromes (ACS) and those undergoing percutaneous coronary intervention (PCI), these genotypes account for only a small fraction of the variation and the evidence is insufficient. This study aims to identify new genetic variants associated with the MACE end point during an 18-month follow-up period using two-stage large-scale sequencing data, comprising high-depth whole exome sequencing of 168 patients in the discovery cohort and high-depth targeted sequencing of 1793 patients in the replication cohort. We discovered eight new genotypes and their genes associated with MACE in patients with ACS, including MYOM2 (rs17064642), WDR24 (rs11640115), NECAB1 (rs74569896), EFR3A (rs4736529), AGAP3 (rs75750968), ZDHHC3 (rs3749187), ECHS1 (rs140410716), and KRTAP10-4 (rs201441480). Notably, the expression of MYOM2 and ECHS1 is downregulated in both animal models and patients with phenotypes related to MACE. Importantly, we developed the first superior classifier for predicting 18-month MACE and achieved high predictive performance (AUC ranged between 0.92 and 0.94 for three machine-learning methods). Our findings shed light on the pathogenesis of cardiovascular outcomes and may help clinicians decide on therapeutic interventions for ACS patients.