HLA genotyping via next-generation sequencing (NGS) poses challenges for the use of HLA allele names to analyze and discuss sequence polymorphism. NGS will identify many new synonymous and non-coding HLA sequence variants. Allele names identify the types of nucleotide polymorphism that define an allele (non-synonymous, synonymous and non-coding changes), but they do not describe how that polymorphism is distributed among the individual features of a gene (the flanking untranslated regions, exons and introns). Further, HLA alleles cannot be named in the absence of the exons that encode the antigen-recognition domain (ARD). Here, a system for describing HLA polymorphism in terms of HLA gene features (GFs) is proposed. This system enumerates the unique nucleotide sequences for each GF in an HLA gene and records them in a GF enumeration notation. The notation allows both a more granular dissection of allele-level HLA polymorphism and the discussion and analysis of GFs in the absence of ARD-encoding exon sequences.
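The enumeration idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the notation defined by the proposed system: the class name `FeatureEnumerator`, the five-feature toy gene, the hyphen-separated output format, the use of 0 for a feature with no sequence, and the made-up nucleotide strings (which are not real HLA data). Each distinct sequence seen for a feature receives the next unused integer for that feature, so two alleles that differ only in one intron differ in exactly one field.

```python
# Toy feature order for an illustrative gene (assumption); real HLA genes
# have more exons and introns than this.
FEATURES = ["5UTR", "exon1", "intron1", "exon2", "3UTR"]


class FeatureEnumerator:
    """Assigns a number to each unique sequence of each gene feature."""

    def __init__(self, locus, features):
        self.locus = locus
        self.features = list(features)
        # One lookup table per feature: sequence -> assigned number.
        self.tables = {feature: {} for feature in self.features}

    def number_for(self, feature, sequence):
        """Return the number for a sequence, assigning the next one if new."""
        table = self.tables[feature]
        if sequence not in table:
            table[sequence] = len(table) + 1
        return table[sequence]

    def notation(self, feature_sequences):
        """Build a hyphen-separated string with one number per feature.

        Features absent from feature_sequences are written as 0
        (an assumption of this sketch).
        """
        numbers = []
        for feature in self.features:
            sequence = feature_sequences.get(feature)
            if sequence is None:
                numbers.append(0)
            else:
                numbers.append(self.number_for(feature, sequence.upper()))
        return self.locus + "-" + "-".join(str(n) for n in numbers)


enumerator = FeatureEnumerator("HLA-A", FEATURES)

# Made-up sequences; the second allele differs from the first only in intron1.
first = {"5UTR": "ACGT", "exon1": "ATGGCC", "intron1": "GTAAGT",
         "exon2": "GGCTCC", "3UTR": "TTTA"}
second = dict(first, intron1="GTGAGT")

print(enumerator.notation(first))    # HLA-A-1-1-1-1-1
print(enumerator.notation(second))   # HLA-A-1-1-2-1-1
# A lone intron sequence can still be described without any exon data.
print(enumerator.notation({"intron1": "GTGAGT"}))  # HLA-A-0-0-2-0-0
```

In this sketch the last call shows why a feature-level description remains usable when the ARD-encoding exons have not been sequenced: the known feature keeps its number and the unknown features are simply left unassigned.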
Abbreviations
ARD: Antigen Recognition Domain
EMBL: European Molecular Biology Laboratory
GF: Gene Feature
GFE: Gene Feature Enumeration
HLA: Human Leucocyte Antigen
IHIW: International HLA and Immunogenetics Workshop
IMGT: ImMunoGeneTics
NGS: Next Generation Sequencing
UTR: Untranslated Region