The accurate mapping of reads that span splice junctions is a critical component of all analytic techniques that work with RNA-seq data. We introduce a second-generation splice detection algorithm, MapSplice, whose focus is high sensitivity and specificity in the detection of splices as well as CPU and memory efficiency. MapSplice can be applied to both short (<75 bp) and long (≥75 bp) reads. MapSplice does not depend on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of read alignments of a given splice to increase accuracy. We demonstrate that MapSplice achieves higher sensitivity and specificity than TopHat and SpliceMap on a set of simulated RNA-seq data. Experimental studies also support the accuracy of the algorithm. Splice junctions derived from eight breast cancer RNA-seq datasets recapitulated the extensiveness of alternative splicing on a global level as well as the differences between molecular subtypes of breast cancer. These combined results indicate that MapSplice is a highly accurate algorithm for the alignment of RNA-seq reads to splice junctions. Software download URL: http://www.netlab.uky.edu/p/bioinfo/MapSplice.
We report a high-quality draft sequence of the genome of the horse (Equus caballus). The genome is relatively repetitive, but has little segmental duplication. Chromosomes appear to have undergone few historical rearrangements: 48% of equine chromosomes show conserved synteny to a single human chromosome. Equine chromosome 11 is shown to have an evolutionarily new centromere devoid of centromeric satellite DNA, suggesting that centromeric function may arise prior to satellite repeat accumulation. Linkage disequilibrium, showing the influences of early domestication of large herds of female horses, is intermediate in length between dog and human, and there is long-range haplotype sharing among breeds.
While the phenomenon of polyadenylation has been well-studied, the dynamics of poly(A) tail size and its impact on transcript function and cell biology are less well-appreciated. The goal of this review is to encourage readers to view the poly(A) tail as a dynamic, changeable aspect of a transcript rather than a simple static entity that marks the 3′ end of an mRNA. This could open up new angles of regulation in the post-transcriptional control of gene expression throughout development, differentiation and cancer.
Previously, we have shown that horses could be divided into susceptible and resistant groups based on an in vitro assay using dual-color flow cytometric analysis of CD3+ T cells infected with equine arteritis virus (EAV). Here, we demonstrate that the differences in in vitro susceptibility of equine CD3+ T lymphocytes to EAV infection have a genetic basis. To investigate the possible hereditary basis for this trait, we conducted a genome-wide association study (GWAS) to compare susceptible and resistant phenotypes. Testing of 267 DNA samples from four horse breeds that had a susceptible or a resistant CD3+ T lymphocyte phenotype using both Illumina Equine SNP50 BeadChip and Sequenom's MassARRAY system identified a common, genetically dominant haplotype associated with the susceptible phenotype in a region of equine chromosome 11 (ECA11), positions 49572804 to 49643932. The presence of a common haplotype indicates that the trait occurred in a common ancestor of all four breeds, suggesting that it may be segregated among other modern horse breeds. Biological pathway analysis revealed several cellular genes within this region of ECA11 encoding proteins associated with virus attachment and entry, cytoskeletal organization, and NF-κB pathways that may be associated with the trait responsible for the in vitro susceptibility/resistance of CD3+ T lymphocytes to EAV infection. The data presented in this study demonstrated a strong association of genetic markers with the trait, representing de facto proof that the trait is under genetic control. To our knowledge, this is the first GWAS of an equine infectious disease and the first GWAS of equine viral arteritis.
Equine arteritis virus (EAV) is a small enveloped virus with a positive-sense, single-stranded RNA genome of 12.7 kb and belongs to the family Arteriviridae (genus Arterivirus, order Nidovirales) (16, 64). EAV is the causal agent of equine viral arteritis (EVA), a disease of equids (12, 13).
The vast majority of EAV infections are inapparent or subclinical (5, 6, 68). However, some acutely infected horses may develop any combination of the following clinical signs: pyrexia, depression, anorexia, dependent edema (scrotum, ventral trunk, and limbs), conjunctivitis, lacrimation and swelling around the eyes (periorbital or supraorbital edema), respiratory distress, urticaria, and leukopenia (5, 6). During natural outbreaks of the disease, the virus can cause abortion in pregnant mares, with abortion rates varying from 10 to 71%, depending on the virus strain (5, 6). Following EAV infection, a variable proportion of stallions (30 to 70%) can become persistently infected and continuously shed the virus in their semen (55,68). The mechanism of persistence of EAV in the male reproductive tract is not clear. However, studies have established that persistence of EAV in stallions is testosterone dependent (42,52). Moreover, the prevalences of EAV infection differ markedly among different breeds of horses (68), strengthening the assumption of genetic influence on susceptibili...
Summary: The horse, like the majority of animal species, has a limited amount of species-specific expressed sequence data available in public databases. As a result, structural models for the majority of genes defined in the equine genome are predictions based on ab initio sequence analysis or the projection of gene structures from other mammalian species. The current study used Illumina-based sequencing of messenger RNA (RNA-seq) to help refine structural annotation of equine protein-coding genes and for a preliminary assessment of gene expression patterns. Sequencing of mRNA from eight equine tissues generated 293 758 105 sequence tags of 35 bases each, equalling 10.28 Gbp of total sequence data. The tag alignments represent approximately 207× coverage of the equine mRNA transcriptome and confirmed transcriptional activity for roughly 90% of the protein-coding gene structures predicted by Ensembl and NCBI. Tag coverage was sufficient to refine the structural annotation for 11 356 of these predicted genes, while also identifying an additional 456 transcripts with exon/intron features that are not listed by either Ensembl or NCBI. Genomic locus data and intervals for the protein-coding genes predicted by the Ensembl and NCBI annotation pipelines were combined with 75 116 RNA-seq-derived transcriptional units to generate a consensus equine protein-coding gene set of 20 302 defined loci. Gene ontology annotation was used to compare the functional and structural categories of genes expressed in either a tissue-restricted pattern or broadly across all tissue samples.
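The total sequence yield quoted in this summary follows directly from the tag count and tag length. The short Python sketch below simply reproduces that arithmetic; the function and constant names are illustrative and do not come from the study.

```python
def total_bases(tag_count: int, tag_length: int) -> int:
    """Total sequenced bases for fixed-length tags (illustrative helper)."""
    return tag_count * tag_length


# Figures taken from the summary above.
TAG_COUNT = 293_758_105   # sequence tags across eight equine tissues
TAG_LENGTH = 35           # bases per tag

bases = total_bases(TAG_COUNT, TAG_LENGTH)
gbp = bases / 1e9         # 1 Gbp = 1e9 bases

print(f"{bases:,} bases = {gbp:.2f} Gbp")
```

The product rounds to the 10.28 Gbp stated in the summary.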