The human leukocyte antigen (HLA) system is a group of genes coding for proteins that are central to the adaptive immune system. Identifying the specific HLA allele combination of a patient is relevant in organ donation, in risk assessment of autoimmune and infectious diseases, and in cancer immunotherapy. However, due to the high genetic polymorphism in this region, HLA typing requires specialized methods.
We investigated the performance of five next-generation sequencing (NGS) based HLA typing tools with a non-restricted license, namely HLA*LA, Optitype, HISAT-genotype, Kourami and STC-Seq. The evaluation covered the five HLA loci HLA-A, -B, -C, -DRB1 and -DQB1, using whole-exome sequencing (WES) samples from 829 individuals. The robustness of the tools to lower coverage was evaluated by subsampling 230 WES samples to coverages ranging from 1X to 100X and HLA typing the subsampled data. Typing accuracy was measured at four typing resolutions. Among these, we present two clinically relevant typing resolutions that focus specifically on the peptide binding region.
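As an illustration of how typing accuracy at a chosen resolution can be computed, the following minimal Python sketch compares predicted and reference allele pairs per locus. It is not the evaluation code used in this study. It assumes that a resolution can be approximated by truncating allele names to a fixed number of colon-separated fields; resolutions defined on the peptide binding region would instead require a lookup table from the allele nomenclature database. All function names and the example data are hypothetical.

```python
from collections import Counter


def truncate_allele(allele: str, fields: int) -> str:
    """Keep only the first `fields` colon-separated fields of an allele name."""
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def count_matches(predicted, truth, fields: int) -> int:
    """Count correctly called alleles at one locus, ignoring allele order."""
    pred = Counter(truncate_allele(a, fields) for a in predicted)
    true = Counter(truncate_allele(a, fields) for a in truth)
    # Multiset intersection: each reference allele can be matched at most once.
    return sum((pred & true).values())


def typing_accuracy(calls, gold, fields: int = 2) -> float:
    """Fraction of reference alleles recovered across all samples and loci."""
    correct = total = 0
    for sample, loci in gold.items():
        for locus, truth in loci.items():
            # A missing call counts as zero correct alleles for that locus.
            predicted = calls.get(sample, {}).get(locus, [])
            correct += count_matches(predicted, truth, fields)
            total += len(truth)
    return correct / total if total else float("nan")


# Hypothetical example: one sample, one locus, one of two alleles miscalled.
gold = {"sample1": {"A": ["A*02:01:01:01", "A*24:02:01:01"]}}
calls = {"sample1": {"A": ["A*24:02", "A*02:05"]}}
print(typing_accuracy(calls, gold, fields=2))  # 0.5
print(typing_accuracy(calls, gold, fields=1))  # 1.0
```

Counting matches per allele rather than per genotype gives partial credit when only one of the two alleles at a locus is called correctly; a stricter per-genotype definition is equally possible.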
On average across the five HLA genes, HLA*LA had the highest typing accuracy. For the individual class I genes HLA-A, -B and -C, Optitype had the highest typing accuracy, whereas HLA*LA had the highest typing accuracy for the class II genes HLA-DRB1 and -DQB1.
The tools' robustness to lower coverage data varied widely and further depended on the specific HLA locus. For all class I loci, Optitype had a typing accuracy above 95% at 50X, measured at the resolution defined by the amino acids in the functionally relevant portion of the protein. In contrast, increasing the depth of coverage even beyond 100X could still improve the typing accuracy of HISAT-genotype, Kourami and STC-Seq across all five HLA genes, as well as HLA*LA's typing accuracy for HLA-DQB1.
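To relate depth-of-coverage levels such as those above to a subsampling step, the sketch below estimates mean coverage from read counts and derives the fraction of reads to keep for a target coverage. It is only an illustration under simple assumptions (uniform read length, coverage averaged over a fixed target region) and does not describe the subsampling procedure used in this study. The function names and example numbers are hypothetical.

```python
def mean_coverage(n_reads: int, read_length: int, target_bases: int) -> float:
    """Approximate mean depth: total sequenced bases divided by target size."""
    if target_bases <= 0:
        raise ValueError("target_bases must be positive")
    return n_reads * read_length / target_bases


def subsampling_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep so that mean coverage drops to target_cov."""
    if current_cov <= 0:
        raise ValueError("current coverage must be positive")
    if target_cov > current_cov:
        raise ValueError("cannot subsample to a coverage above the current one")
    return target_cov / current_cov


# Hypothetical example: 4 million reads of 100 bp over a 2 Mb target region.
cov = mean_coverage(n_reads=4_000_000, read_length=100, target_bases=2_000_000)
print(cov)                            # 200.0
print(subsampling_fraction(cov, 50))  # 0.25
```

The resulting fraction could then be passed to a read subsampling tool; random subsampling yields the target coverage only on average, so the realized depth at individual HLA loci will vary.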
HLA typing is also used in studies of ancient DNA (aDNA), which are often based on lower-quality sequencing data. Interestingly, we found that Optitype's typing accuracy is not notably impaired by short read length or by DNA damage, both typical of aDNA, as long as the depth of coverage is sufficiently high.