Background
The prevalence and mortality of COVID-19 show marked geographic variation. The presence of several subtypes of the coronavirus and genetic differences among populations could contribute to that variation. Thus, the objective of this study was to propose variants in genes encoding proteins involved in SARS-CoV-2 entry into host cells as possible targets for genetic association studies.
Methods
The allelic frequencies of polymorphisms in the ACE2, TMPRSS2, TMPRSS11A, cathepsin L (CTSL), and elastase (ELANE) genes were obtained for four populations from the American, African, European, and Asian continents reported in the 1000 Genomes Project. In addition, we evaluated the potential biological effect of these variants using several web-based tools.
Results
In the coding sequences of these genes, we detected one probably damaging polymorphism in the TMPRSS2 gene (rs12329760) that produces an amino acid change. Furthermore, forty-eight polymorphisms with possible functional consequences were detected in the non-coding sequences of these genes: three in ACE2, seventeen in TMPRSS2, ten in TMPRSS11A, twelve in ELANE, and six in CTSL. These polymorphisms produce binding sites for transcription factors and microRNAs. The minor allele frequencies of these polymorphisms vary among populations; indeed, some are high in specific populations.
Conclusion
In summary, using data from the 1000 Genomes Project and web-based tools, we propose several polymorphisms that, depending on the population, could be used in genetic association studies.