The analysis of deletions may reveal evolutionary trends and provide new insight into the surprising variability and rapidly spreading capability that SARS-CoV-2 has shown since its emergence. To understand the factors governing genomic stability, it is important to define the molecular mechanisms of deletions in the viral genome. In this work, we performed a statistical analysis of deletions. Specifically, we analyzed correlations between deletions in the SARS-CoV-2 genome and repetitive elements and documented a significant association of deletions with runs of identical (poly-) nucleotides and direct repeats. Our analyses of deletions in the accessory genes of SARS-CoV-2 suggested that there may be a hypervariability in ORF7A and ORF8 that is not associated with repetitive elements. Such recurrent search in a “sequence space” of accessory genes (that might be driven by natural selection) did not yet cause increased viability of the SARS-CoV-2 variants. However, deletions in the accessory genes may ultimately produce new variants that are more successful compared to the viral strains with the conventional architecture of the SARS-CoV-2 accessory genes.
Background Accessory proteins have diverse roles in coronavirus pathobiology. One of them in SARS-CoV (the causative agent of the severe acute respiratory syndrome outbreak in 2002–2003) is encoded by the open reading frame 8 (ORF8). Among the most dramatic genomic changes observed in SARS-CoV isolated from patients during the peak of the pandemic in 2003 was the acquisition of a characteristic 29-nucleotide deletion in ORF8. This deletion cause splitting of ORF8 into two smaller ORFs, namely ORF8a and ORF8b. Functional consequences of this event are not entirely clear. Results Here, we performed evolutionary analyses of ORF8a and ORF8b genes and documented that in both cases the frequency of synonymous mutations was greater than that of nonsynonymous ones. These results suggest that ORF8a and ORF8b are under purifying selection, thus proteins translated from these ORFs are likely to be functionally important. Comparisons with several other SARS-CoV genes revealed that another accessory gene, ORF7a, has a similar ratio of nonsynonymous to synonymous mutations suggesting that ORF8a, ORF8b, and ORF7a are under similar selection pressure. Conclusions Our results for SARS-CoV echo the known excess of deletions in the ORF7a-ORF7b-ORF8 complex of accessory genes in SARS-CoV-2. A high frequency of deletions in this gene complex might reflect recurrent searches in “functional space” of various accessory protein combinations that may eventually produce more advantageous configurations of accessory proteins similar to the fixed deletion in the SARS-CoV ORF8 gene.
Nucleotide substitutions in protein-coding genes can be divided into synonymous (S) and non-synonymous (N) ones that alter amino acids (including nonsense mutations causing stop codons). The S substitutions are expected to have little effect on function. The N substitutions almost always are affected by strong purifying selection that eliminates them from evolving populations. However, additional mutations of nearby bases can modulate the deleterious effect of single N substitutions and, thus, could be subjected to the positive selection. This effect has been demonstrated for mutations in the serine codons, stop codons and double N substitutions in prokaryotes. In all abovementioned cases, a novel technique was applied that allows elucidating the effects of selection on double substitutions considering mutational biases. Here, we applied the same technique to study double N substitutions in eukaryotic lineages of primates and yeast. We identified markedly fewer cases of purifying selection relative to prokaryotes and no evidence of codon double substitutions under positive selection. This is consistent with previous studies of serine codons in primates and yeast. In general, the obtained results strongly suggest that there are major differences between studied pro- and eukaryotes; double substitutions in primates and yeasts largely reflect mutational biases and are not hallmarks of selection. This is especially important in the context of detection of positive selection in codons because it has been suggested that multiple mutations in codons cause false inferences of lineage-specific site positive selection. It is likely that this concern is applicable to previously studied prokaryotes but not to primates and yeasts where markedly fewer double substitutions are affected by positive selection.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.