In our previous work, we developed an automated tool, AutoVEM, for real-time monitoring of candidate key mutations and epidemic trends of SARS-CoV-2. In this research, we further developed AutoVEM into AutoVEM2. AutoVEM2 is composed of three modules (a call module, an analysis module, and a plot module), which can be used individually or together for any virus, as long as the corresponding reference genome is provided. It is therefore much more flexible than AutoVEM. Here, we analyzed three existing viruses with AutoVEM2, namely SARS-CoV-2, HBV and HPV-16, to demonstrate the functions, effectiveness and flexibility of AutoVEM2. We found that the N501Y locus was almost completely linked to 16 other loci in SARS-CoV-2 genomes from the UK and Europe. Of these 17 loci, 5 were on the S protein, and all five mutations cause amino acid changes, which may influence the epidemic traits of SARS-CoV-2. Some candidate key mutations of HBV and HPV-16, including T350G of HPV-16 and C659T of HBV, were also detected. In brief, we developed a flexible automated tool to analyze candidate key mutations and epidemic trends for any virus, which would become a standard process for virus analysis based on genome sequences in the future.
SARS-CoV-2 has been spreading rapidly since 2019 and has accumulated a large number of mutations in its genome. Differences in gene sequences may lead to changes in protein structure and traits, which would have a great impact on epidemiological characteristics. In this study, we selected key mutations of SARS-CoV-2, including D614G and A222V of the S protein and Q57H of the ORF3a protein, and performed molecular dynamics simulation and analysis on the structures of the mutant proteins. The results suggested that D614G improved the stability of the S protein, while A222V enhanced the ability of the protein to react with the outside environment. Q57H enhanced the structural flexibility of the ORF3a protein. Our findings could help complete the mechanistic link among genotype, phenotype and epidemiological characteristics in the study of SARS-CoV-2. We also found no significant changes in antigenicity between the S and ORF3a proteins and their mutants, which provides a reference for vaccine development and application.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is continuously evolving, bringing great challenges to the control of the virus. In the present study, we investigated the characteristics of SARS-CoV-2 within-host diversity in human hosts and its implications for immune evasion using about 200,000 high-depth next-generation genome sequencing data sets of SARS-CoV-2. A total of 44% of the samples showed within-host variations (iSNVs), and the average number of iSNVs in the samples with iSNVs was 1.90. C-to-U is the dominant substitution pattern for iSNVs. C-to-U/G-to-A and A-to-G/U-to-C preferentially occur in 5′-CG-3′ and 5′-AU-3′ motifs, respectively. In addition, we found that SARS-CoV-2 within-host variations are under negative selection. About 15.6% of iSNVs had an impact on the content of the CpG dinucleotide (CpG) in SARS-CoV-2 genomes. We detected signatures of faster loss of CpG-gaining iSNVs, possibly resulting from zinc-finger antiviral protein-mediated antiviral activities targeting CpG, which could be the major reason for CpG depletion in SARS-CoV-2 consensus genomes. The non-synonymous iSNVs in the S gene can largely alter the S protein’s antigenic features, and many of these iSNVs are distributed in the amino-terminal domain (NTD) and receptor-binding domain (RBD). These results suggest that SARS-CoV-2 interacts actively with human hosts and adopts different evolutionary strategies to escape human innate and adaptive immunity. These new findings further deepen and widen our understanding of the within-host evolutionary features of SARS-CoV-2.
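As an illustration of what it means for an iSNV to have an impact on CpG content, the following minimal Python sketch classifies a single-base substitution as CpG-gaining, CpG-losing, or CpG-neutral by comparing the CpG count in the three-base window around the variant before and after the change. It is not the pipeline used in the study: the function names `count_cpg` and `cpg_effect`, the 0-based position convention, and the toy sequences are all assumptions made for this example.

```python
def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G) in seq."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cpg_effect(ref: str, pos: int, alt: str) -> str:
    """Classify a substitution at 0-based position pos (hypothetical helper).

    Returns "CpG-gaining", "CpG-losing" or "CpG-neutral".
    """
    ref = ref.upper()
    alt = alt.upper()
    if not 0 <= pos < len(ref):
        raise IndexError("pos is outside the reference sequence")
    if len(alt) != 1 or alt == ref[pos]:
        raise ValueError("alt must be a single base that differs from the reference")

    mutated = ref[:pos] + alt + ref[pos + 1:]
    # A single substitution can only affect the dinucleotides that overlap it,
    # so it is enough to compare the window [pos - 1, pos + 1].
    lo, hi = max(0, pos - 1), min(len(ref), pos + 2)
    delta = count_cpg(mutated[lo:hi]) - count_cpg(ref[lo:hi])

    if delta > 0:
        return "CpG-gaining"
    if delta < 0:
        return "CpG-losing"
    return "CpG-neutral"


# Toy sequences, not real SARS-CoV-2 data.
print(cpg_effect("AACGTT", 2, "T"))  # CpG-losing: the CG at positions 2-3 becomes TG
print(cpg_effect("AATGTT", 2, "C"))  # CpG-gaining: TG becomes CG
```

Because CpG involves only C and G, the same check applies whether the genome is written with T (as in most sequence files) or U. Under this definition, a C-to-U change inside a 5′-CG-3′ motif is always CpG-losing.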
IMPORTANCE
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of coronavirus disease 2019, has evolved rapidly since it was discovered. Recent studies have pointed out that some mutations in the SARS-CoV-2 S protein could confer on SARS-CoV-2 the ability to evade the human adaptive immune system. In addition, the content of the CpG dinucleotide in SARS-CoV-2 genome sequences has been observed to decrease over time, reflecting adaptation to the human host. The significance of our research lies in revealing the characteristics of SARS-CoV-2 within-host diversity in human hosts, identifying the causes of CpG depletion in SARS-CoV-2 consensus genomes, and exploring the potential impacts of non-synonymous within-host variations in the S gene on immune escape, which could further deepen and widen our understanding of the evolutionary features of SARS-CoV-2.