Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to deliver annotation efficiently and at scale for all eukaryotic life, and it also provides deep, comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continue to improve, extend and update the annotation for our high-value reference vertebrate genomes, and we report the details here. We also provide software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.
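As an illustration of the programmatic access mentioned above, the following is a minimal sketch of a query to the Variant Recoder through the public Ensembl REST service. The abstract does not describe this interface; the server address `https://rest.ensembl.org` and the `variant_recoder/:species/:id` path are taken from the Ensembl REST documentation as we understand it, and the helper names and the example identifier `rs56116432` are hypothetical choices made for illustration only.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the public Ensembl REST server; not stated in the abstract.
REST_SERVER = "https://rest.ensembl.org"


def variant_recoder_url(variant_id: str, species: str = "human") -> str:
    """Build the GET URL for the variant_recoder endpoint (hypothetical helper)."""
    # Percent-encode both path segments so identifiers such as HGVS strings
    # containing ':' or '>' remain valid inside the URL path.
    return (
        f"{REST_SERVER}/variant_recoder/"
        f"{quote(species, safe='')}/{quote(variant_id, safe='')}"
    )


def fetch_recoded(variant_id: str, species: str = "human"):
    """Fetch equivalent identifiers for one variant. Requires network access."""
    request = urllib.request.Request(
        variant_recoder_url(variant_id, species),
        headers={"Content-Type": "application/json"},  # ask for JSON output
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Illustrative identifier only; any dbSNP rsID or HGVS string could be used.
    print(variant_recoder_url("rs56116432"))
```

Only the URL is built when the script runs; calling `fetch_recoded("rs56116432")` would send the request, and the shape of the returned JSON should be checked against the current REST documentation rather than assumed.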
We have developed a genotyping assay that produces fully phased, unambiguous HLA-E genotypes using Pacific Biosciences' single molecule real-time DNA sequencing. In total, 212 cell lines were genotyped, including the panel of 107 established at the 10th International Histocompatibility Workshop. Our results matched the previously known HLA-E genotype in 94 (44.3%) cell lines, in all cases either improving or equalling the previous genotyping resolution. Three (1.4%) cell lines had discrepant HLA-E genotyping data and 115 (54.2%) had no previous HLA-E data. The HLA-E genotypes for four (1.9%) cell lines resulted in a change of zygosity by identifying two distinct haplotypes. We discovered eight novel HLA-E alleles, extended the known reference sequence of seven and confirmed the existence of a further 10.
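The reported proportions can be checked with a short calculation. The sketch below uses only the counts stated in the abstract; the dictionary keys and the `percent` helper are illustrative names, and rounding to one decimal place is an assumption about how the percentages were reported.

```python
TOTAL_CELL_LINES = 212  # cell lines genotyped, as stated in the abstract

# Mutually exclusive outcomes of comparing with previous HLA-E data.
comparison_counts = {
    "matched previous genotype": 94,
    "discrepant with previous genotype": 3,
    "no previous HLA-E data": 115,
}

# Reported separately in the abstract, not as a fourth comparison outcome.
ZYGOSITY_CHANGES = 4


def percent(count: int, total: int = TOTAL_CELL_LINES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for label, count in comparison_counts.items():
        print(f"{label}: {count} ({percent(count)}%)")
    print(f"zygosity changes: {ZYGOSITY_CHANGES} ({percent(ZYGOSITY_CHANGES)}%)")
    print("categories sum to total:",
          sum(comparison_counts.values()) == TOTAL_CELL_LINES)
```

The three comparison categories account for all 212 cell lines, and each computed value agrees with the percentage quoted in the text.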
As the primary genetic determinant of immune recognition of self and non‐self, the hyperpolymorphic HLA genes play key roles in disease association and transplantation. The large, variably sized HLA class II genes have historically been less well characterized than the shorter HLA class I genes. Here, we have used Pacific Biosciences Single Molecule Real‐Time (SMRT®) DNA sequencing to perform four‐field resolution HLA typing of HLA‐DRB1/3/4/5, ‐DQA1, ‐DQB1, ‐DPA1 and ‐DPB1 from a panel of 181 B‐lymphoblastoid cell lines from the International HLA and Immunogenetics Workshops. By interrogating all exons, introns, and the untranslated regions of these important reference cells, we have improved their HLA typing resolution on the IPD‐IMGT/HLA Database. We observed widespread non‐coding polymorphism, with over twice as many unique genomic sequences identified as unique coding sequences (CDS). We submitted 263 unique sequences to the IPD‐IMGT/HLA Database, often from multiple cell lines, including 114 confirmations of existing alleles, of which 30 were also extensions to full‐length genomic sequences where only CDS was available previously. A total of 149 novel alleles were identified, most differing from their closest reference allele sequence by a single nucleotide polymorphism (SNP). However, some highly divergent alleles were deemed to be recombinants, detectable only by full‐length sequencing with long, phased reads. The fourth‐field variation we observed allowed fine mapping of linkage disequilibrium patterns and haplotypes to particular ancestries. This study has highlighted the under‐appreciated non‐coding diversity in HLA class II genes, with potential implications for population genetic and clinical studies.