Background
The uterine cervix has an important role in female reproductive health, but little is known about the genetic determinants of cervical biology and pathology. Genome-wide association studies (GWAS) with increasing sample sizes have reported a few genetic associations for cervical cancer. However, GWAS is only the first step in mapping genetic susceptibility, and the underlying biology of cervical cancer and other cervical phenotypes is therefore still not fully understood. Here, we use data from large biobanks to characterise the genetics of cervical phenotypes (including cervical cancer) and leverage the latest computational methods and gene expression data to refine the association signals for cervical cancer.
Methods
Using Estonian Biobank and FinnGen data, we characterise the genetic signals associated with cervical ectropion (10,162 cases/151,347 controls), cervicitis (19,285/185,708) and cervical dysplasia (14,694/150,563). We present the results from the largest trans-ethnic GWAS meta-analysis of cervical cancer, including up to 9,229 cases and 490,304 controls from Estonian Biobank, the FinnGen study, the UK Biobank and Biobank Japan. We combine GWAS results with gene expression data and chromatin regulatory annotations in HeLa cervical carcinoma cells to propose the most likely candidate genes and causal variants for every locus associated with cervical cancer. We further dissect the HLA association with cervical pathology using imputed data on alleles and amino acid polymorphisms.
Results
We report a single associated locus on 2q13 for both cervical ectropion (rs3748916, p=5.1 × 10⁻¹⁶) and cervicitis (rs1049137, p=3.9 × 10⁻¹⁰), and five signals for cervical dysplasia: 6p21.32 (rs1053726, p=9.1 × 10⁻⁹; rs36214159, p=1.6 × 10⁻²²), 2q24.1 (rs12611652, p=3.2 × 10⁻⁹) near DAPL1, 2q13 (rs1049137, p=6.4 × 10⁻⁹) near PAX8, and 5p15.33 (rs6866294, p=2.1 × 10⁻⁹) downstream of CLPTM1L. We identify five loci associated with cervical cancer in the trans-ethnic meta-analysis: 1p36.12 (rs2268177, p=3.1 × 10⁻⁸), 2q13 (rs4849177, p=9.4 × 10⁻¹⁵), 5p15.33 (rs27069, p=1.3 × 10⁻¹⁴), 17q12 (rs12603332, p=1.2 × 10⁻⁹), and 6p21.32 (rs35508382, p=1.0 × 10⁻³⁹). Joint analysis of dysplasia and cancer datasets revealed an association on chromosome 19 (rs425787, p=3.5 × 10⁻⁸), near CD70.
Conclusions
Our results map PAX8/PAX8-AS1, LINC00339, CDC42, CLPTM1L, HLA-DRB1, HLA-B, and GSDMB as the most likely candidate genes for cervical cancer, providing novel insight into cervical cancer pathogenesis and supporting the role of genes involved in reproductive tract development, immune response and cellular proliferation/apoptosis. We further show that PAX8/PAX8-AS1 has a central role in cervical biology and pathology, as it was associated with all analysed phenotypes. The detailed characterisation of association signals, together with the mapping of causal variants and genes, offers valuable leads for further functional studies.