Approximately 5% of the human genome consists of structural variants, which are enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length, carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A and glycophorin B are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They act as receptors for invasion of a causative agent of malaria, Plasmodium falciparum. A particular complex structural variant (DUP4) that creates a GYPB/GYPA fusion gene is known to confer resistance to malaria. Many other structural variants exist, and remain poorly characterised. Here, we analyse sequences from 6466 genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection using fibre-FISH and breakpoint mapping. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by unequal cross over events (non-allelic homologous recombination, NAHR) and, by comparing the structural variant breakpoints with recombination hotspot maps, show the importance of a particular meiotic recombination hotspot on structural variant formation in this region.
Background: Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly-repeated regions about 120 kb in length and carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are receptors for the invasion of the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, and they remain poorly characterised. Results: Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and, by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot on structural variant formation in this region. Conclusions: We identify and validate large structural variants in the human glycophorin A-B-E gene cluster which may be associated with different clinical aspects of malaria.
Structural variation in the human genome can affect risk of disease. An example is a complex structural variant of the human glycophorin gene cluster, called DUP4, which is associated with a clinically significant level of protection against severe malaria. The human glycophorin gene cluster harbours at least 23 distinct structural variants, and accurate genotyping of this complex structural variation remains a challenge. Here, we use a polymerase chain reaction‐based strategy to genotype structural variation at the human glycophorin gene cluster, including the alleles responsible for the U– blood group. We validate our approach, based on a triplex paralogue ratio test, on publicly available samples from the 1000 Genomes project. We then genotype 574 individuals from a longitudinal birth cohort (Tori‐Bossito cohort) using small amounts of DNA at low cost. Our approach readily identifies known deletions and duplications, and can potentially identify novel variants for further analysis. It will allow exploration of genetic variation at the glycophorin locus, and investigation of its relationship with malaria, in large sample sets at minimal cost, using standard molecular biology equipment.
Structural variation in the human genome can affect risk of disease. An example is a complex structural variant of the human glycophorin gene cluster, called DUP4, which is associated with a clinically significant level of protection against severe malaria. The human glycophorin gene cluster harbours at least 23 distinct structural variants, and accurate genotyping of this complex structural variation remains a challenge. Here, we use a PCR-based strategy to genotype structural variation at the human glycophorin gene cluster. We validate our approach, based on a triplex paralogue ratio test (PRT) combined with junction-fragment-specific PCR, on publicly available samples from the 1000 Genomes project. We then genotype a longitudinal birth cohort using small amounts of DNA at low cost. Our approach readily identifies known deletions and duplications, and can potentially identify novel variants for further analysis. It will allow exploration of genetic variation at the glycophorin locus, and investigation of its relationship with malaria, in large sample sets at minimal cost, using standard molecular biology equipment.