The ongoing SARS-CoV-2 outbreak marks the first time that large amounts of genome sequence data have been generated and made publicly available in near real-time. Early analyses of these data revealed low sequence variation, a finding that is consistent with a recently emerging outbreak, but which raises the question of whether such data are sufficiently informative for phylogenetic inferences of evolutionary rates and time scales. The phylodynamic threshold is a key concept that refers to the point in time at which sufficient molecular evolutionary change has accumulated in available genome samples to obtain robust phylodynamic estimates. For example, before the phylodynamic threshold is reached, genomic variation is so low that even a large number of genome sequences may be insufficient to estimate the virus’s evolutionary rate and the time scale of an outbreak. We collected genome sequences of SARS-CoV-2 from public databases at eight different points in time and conducted a range of tests of temporal signal to determine if and when the phylodynamic threshold was reached, and the range of inferences that could be reliably drawn from these data. Our results indicate that by February 2nd 2020, estimates of evolutionary rates and time scales had become possible. Analyses of subsequent data sets, which included between 47 and 122 genomes, converged at an evolutionary rate of about 1.1 × 10⁻³ subs/site/year and a time of origin of around late November 2019. Our study provides guidelines to assess the phylodynamic threshold and demonstrates that establishing this threshold constitutes a fundamental step for understanding the power and limitations of early data in outbreak genome surveillance.
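The abstract above does not say which tests of temporal signal were used. One common informal check is root-to-tip regression, in which each genome's genetic distance from the tree root is regressed on its sampling time: the slope approximates the evolutionary rate and the x-intercept approximates the time of origin. The following is a minimal sketch of that idea only, not the authors' method. The function name, the sampling times, and the distances are all hypothetical, and the synthetic values were chosen to echo the rate and origin quoted in the abstract rather than taken from real data.

```python
def root_to_tip_regression(sampling_times, root_to_tip):
    """Least-squares fit of root-to-tip distance on sampling time.

    sampling_times: decimal years, one per genome (hypothetical input format).
    root_to_tip:    distances in substitutions per site, one per genome.
    Returns (rate, origin, r_squared); origin is None if the slope is not positive.
    """
    n = len(sampling_times)
    if n != len(root_to_tip) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_t = sum(sampling_times) / n
    mean_d = sum(root_to_tip) / n
    sxx = sum((t - mean_t) ** 2 for t in sampling_times)
    syy = sum((d - mean_d) ** 2 for d in root_to_tip)
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(sampling_times, root_to_tip))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in times or distances")

    rate = sxy / sxx                      # slope: subs/site/year
    intercept = mean_d - rate * mean_t
    r_squared = sxy ** 2 / (sxx * syy)    # strength of the linear relationship
    # The x-intercept is the time at which the fitted distance is zero.
    origin = -intercept / rate if rate > 0 else None
    return rate, origin, r_squared


# Synthetic, noise-free data (assumption): rate 1.1e-3, origin 2019.90.
ASSUMED_RATE = 1.1e-3
ASSUMED_ORIGIN = 2019.90
times = [2020.00, 2020.02, 2020.05, 2020.07, 2020.09]
distances = [ASSUMED_RATE * (t - ASSUMED_ORIGIN) for t in times]

rate, origin, r2 = root_to_tip_regression(times, distances)
print(f"rate={rate:.2e} subs/site/year, origin={origin:.2f}, R^2={r2:.3f}")
```

With real outbreak data the points scatter around the line, and a weak or negative slope suggests that the samples do not yet carry enough signal to estimate a rate. Because tips on a tree are not statistically independent, a regression of this kind is usually treated as an exploratory check rather than a formal test.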
The rapid sharing of sequence information, as seen throughout the current SARS-CoV-2 epidemic, represents an inflection point for genomic epidemiology. Here we describe aspects of coronavirus evolutionary genetics revealed from these data, and provide the first direct RNA sequence of SARS-CoV-2, detailing coronaviral subgenome-length mRNA architecture. The ongoing epidemic of 2019 novel coronavirus (now called SARS-CoV-2, causing the disease COVID-19), which originated in Wuhan, China, has been declared a public health emergency of international concern by the World Health Organisation (WHO) [1][2][3][4]. SARS-CoV-2 is a positive-sense single-stranded RNA ((+)ssRNA) virus of the Coronaviridae family, with related Betacoronaviruses capable of infecting mammalian and avian hosts, resulting in
Phylodynamic models use pathogen genome sequence data to infer epidemiological dynamics. With the increasing genomic surveillance of pathogens, especially during the SARS‐CoV‐2 pandemic, new practical questions about their use are emerging. One such question focuses on the inclusion of un‐sequenced case occurrence data alongside sequenced data to improve phylodynamic analyses. This approach can be particularly valuable if sequencing efforts vary over time. Using simulations, we demonstrate that birth–death phylodynamic models can employ occurrence data to eliminate bias in estimates of the basic reproductive number due to misspecification of the sampling process. In contrast, the coalescent exponential model is robust to such sampling biases, but in the absence of a sampling model it cannot exploit occurrence data. Subsequent analysis of the SARS‐CoV‐2 epidemic in the northwest USA supports these results. We conclude that occurrence data are a valuable source of information in combination with birth–death models. These data should be used to bolster phylodynamic analyses of infectious diseases and other rapidly spreading species in the future.