Monitoring severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genetic diversity and emerging mutations in this ongoing pandemic is crucial for understanding its evolution and assuring the performance of diagnostic tests, vaccines, and therapies against coronavirus disease 2019 (COVID-19). This study reports on the amino acid (aa) conservation degree and the global and regional temporal evolution by epidemiological week (epiweek) for each residue of the four structural SARS-CoV-2 proteins: spike, envelope, membrane, and nucleocapsid. All 105,276 complete and partial SARS-CoV-2 sequences from 117 countries available in the Global Initiative on Sharing All Influenza Data (GISAID) database from 29 December 2019 to 12 September 2020 were downloaded and processed using an in-house bioinformatics tool. Despite the extremely high conservation of SARS-CoV-2 structural proteins (>99%), all four presented aa changes: 142 aa changes in 65 of the 75 envelope aa, 291 aa changes in 165 of the 222 membrane aa, 890 aa changes in 359 of the 419 nucleocapsid aa, and 2671 aa changes in 1132 of the 1273 spike aa. The evolution of mutations differed across geographic regions and epiweeks. The most prevalent aa changes were D614G (81.5%) in the spike protein, followed by the R203K and G204R combination (37%) in the nucleocapsid protein. The presented data provide insight into the genetic variability of SARS-CoV-2 structural proteins during the pandemic and highlight local and worldwide emerging aa changes of interest for further SARS-CoV-2 structural and functional analysis.
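As an illustration of the per-residue conservation degree reported above, the following minimal Python sketch computes, for each position of a reference protein, the percentage of sequences carrying the reference residue and tallies the observed aa changes. It is not the in-house tool used in the study: the function name `conservation_by_position`, the assumption that sequences are already aligned to the reference, and the choice to skip ambiguous (`X`) or gap (`-`) characters are all hypothetical.

```python
from collections import Counter


def conservation_by_position(reference, aligned_sequences):
    """Per-position conservation against a reference protein sequence.

    Assumes (hypothetically) that every sequence is already aligned to the
    reference, so index i in a sequence corresponds to index i in the
    reference. Ambiguous residues ("X") and gaps ("-") are ignored.
    """
    results = []
    for idx, ref_aa in enumerate(reference):
        # Collect the informative residues observed at this position.
        observed = [
            seq[idx]
            for seq in aligned_sequences
            if idx < len(seq) and seq[idx] not in ("X", "-")
        ]
        # Count every residue that differs from the reference.
        changes = Counter(aa for aa in observed if aa != ref_aa)
        conserved = len(observed) - sum(changes.values())
        pct = 100.0 * conserved / len(observed) if observed else float("nan")
        results.append(
            {
                "position": idx + 1,  # 1-based, as in labels such as D614G
                "ref": ref_aa,
                "conservation_pct": pct,
                "changes": dict(changes),
            }
        )
    return results


if __name__ == "__main__":
    # Toy data, not real SARS-CoV-2 sequences.
    toy_reference = "MDG"
    toy_sequences = ["MDG", "MGG", "MDG", "MXG"]
    for row in conservation_by_position(toy_reference, toy_sequences):
        print(row)
```

Grouping the input sequences by region or epiweek before calling the function would give a temporal or geographic breakdown of the kind the abstract describes, under the same assumptions.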
Monitoring SARS-CoV-2’s genetic diversity and emerging mutations in this ongoing pandemic is crucial to understanding its evolution and ensuring the performance of COVID-19 diagnostic tests, vaccines, and therapies. Spain has been one of the main epicenters of COVID-19, reaching the highest number of cases and deaths per 100,000 population in Europe at the beginning of the pandemic. This study aims to investigate the epidemiology of SARS-CoV-2 in Spain and its 18 Autonomous Communities across the six epidemic waves established from February 2020 to January 2022. We report on the circulating SARS-CoV-2 variants in each epidemic wave and Spanish region and analyze the mutation frequency, amino acid (aa) conservation, and most frequent aa changes across each structural, non-structural, and accessory viral protein among the Spanish sequences deposited in the GISAID database during the study period. The overall SARS-CoV-2 mutation frequency was 1.24 × 10⁻⁵. The aa conservation was >99% in all three types of protein, with non-structural proteins being the most conserved. Accessory proteins had more variable positions, while structural proteins presented more aa changes per sequence. Six main lineages spread successfully in Spain from 2020 to 2022. The presented data provide insight into SARS-CoV-2 circulation and genetic variability in Spain during the first two years of the pandemic.