A novel coronavirus, SARS-CoV-2, has caused a pandemic of COVID-19. The evolutionary trend of the virus genome may have implications for infection control policy but remains obscure. We introduce an estimate of the fold change of translational efficiency, based on synonymous variant sites, to characterize the adaptation of the virus to its hosts. The increased translational efficiency of the M and N genes suggests that the population of SARS-CoV-2 benefits from mutations toward favored codons, while the translational efficiency of the ORF1ab gene has slightly decreased. In the coding region of the ORF1ab gene upstream of the -1 frameshift site, the decrease in translational efficiency has weakened in parallel with the growth of the epidemic, indicating inhibition of synthesis of RNA-dependent RNA polymerase and promotion of replication of the genome. Such an evolutionary trend suggests that multiple infections increased virulence in the absence of social distancing.

... genomes of SARS-CoV-2 precludes the traditional methods, which were designed for comparison between distant species, from extracting the subtle trend. Here, we introduce a method based on fold change of translational efficiency (FCTE), which accumulates the effects of synonymous SNVs distributed sparsely in the genome of SARS-CoV-2 and characterizes the adaptation to hosts in the epidemic in China.
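The FCTE idea can be illustrated with a short sketch. This is not the authors' implementation. It assumes that the translational efficiency of a codon is approximated by its expression-weighted usage frequency in host genes, and that the effects of several synonymous sites are accumulated by summing log2 fold changes. The function names `weighted_codon_frequencies` and `log2_fcte`, the toy sequences, and the expression weights are all hypothetical.

```python
import math
from collections import Counter


def weighted_codon_frequencies(genes, expression):
    """Codon frequencies over host coding sequences, weighted by expression.

    genes:      dict mapping gene name -> coding sequence (length multiple of 3)
    expression: dict mapping gene name -> expression level (hypothetical values)
    """
    counts = Counter()
    for name, seq in genes.items():
        weight = expression.get(name, 0.0)
        # Walk the sequence codon by codon; ignore any trailing partial codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += weight
    total = sum(counts.values())
    return {codon: c / total for codon, c in counts.items()}


def log2_fcte(variants, freq):
    """Accumulate log2 fold changes over synonymous (ref_codon, alt_codon) pairs.

    Assumption: effects of separate sites combine multiplicatively, so their
    log2 fold changes are summed. Because both codons of a synonymous pair
    encode the same amino acid, the ratio is the same whether frequencies are
    normalized globally or per amino acid.
    """
    return sum(math.log2(freq[alt] / freq[ref]) for ref, alt in variants)


# Toy host data (hypothetical): two genes using the alanine codons GCC and GCT.
host_genes = {"geneA": "GCCGCCGCT", "geneB": "GCTGCCGCC"}
host_expression = {"geneA": 3.0, "geneB": 1.0}

freq = weighted_codon_frequencies(host_genes, host_expression)

# One synonymous change from the less used codon GCT to the more used GCC.
print(log2_fcte([("GCT", "GCC")], freq))
```

A positive value indicates a change toward a codon that is used more often in highly expressed host genes, and a negative value indicates the opposite.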
RESULTS
Density of SNV in the coding regions of SARS-CoV-2
In 12 ORFs of SARS-CoV-2, there are on average only 0.021 nonsynonymous sites and 0.012 synonymous sites per 1000 nt per genome (Fig. 1A). The numbers of synonymous and nonsynonymous sites in a coding region are both correlated with the length of the coding region (Fig. 1B). Compared to the average mutation density, ORF1ab:A, which is the coding region of the ORF1ab gene upstream of the -1 frameshift site, contains slightly more synonymous sites, while ORF1ab:B, which is downstream of the -1 frameshift site, contains fewer synonymous sites.

FCTE measures the effect of a mutation on the translation of the gene in which it resides. We estimated FCTE by the fold change of codon usage frequencies of the codon pair before and after a synonymous mutation (Table S1). The codon usage frequency was calibrated by the codon frequency averaged over a repertoire of genes weighted by their expression levels in the type II alveolar (AT2) cells of lung tissue, which are probably the target cells of SARS-CoV-2 (15). Two topologies of phylogenetic trees of the genomes of SARS-CoV-2 were examined. The first topology is star-like, in which the central ancestor is the consensus sequence (Fig. 2A). The second topology is a maximum likelihood tree rooted ...

... protein starts synthesizing the complementary strand from the 3' end of the genomic RNA and disrupts long-distance pairing of RNA secondary structure, preventing the frameshifting in later translation (24). So the ribosome will encounter a stop codon soon and t...