This version (2) corrects an analysis that was based on the codon encoding Spike position 943; the apparent mutation at 943 was the result of a sequencing error. The main conclusions of the paper regarding the Spike mutation at position 614 and recombination still hold. The key difference in version 2 is that we have removed the original Figure 6, which was based on the 943 sequencing artifact, and moved a figure illustrating recombination that is independent of position 943 from the supplement into the main text. BK
Summary
We have developed an analysis pipeline to facilitate real-time mutation tracking in SARS-CoV-2, focusing initially on the Spike (S) protein because it mediates infection of human cells and is the target of most vaccine strategies and antibody-based therapeutics. To date we have identified thirteen mutations in Spike that are accumulating. Mutations are considered in a broader phylogenetic context, geographically, and over time, to provide an early warning system that reveals mutations that may confer selective advantages in transmission or resistance to interventions. Each mutation is evaluated for evidence of positive selection, and its implications are explored through structural modeling. The mutation Spike D614G is of urgent concern; it began spreading in Europe in early February, and when introduced to new regions it rapidly becomes the dominant form. We also present evidence of recombination between locally circulating strains, indicative of multiple-strain infections. These findings have important implications for SARS-CoV-2 transmission, pathogenesis, and immune interventions.