2020
DOI: 10.1093/molbev/msaa314

Phylogenetic Analysis of SARS-CoV-2 Data Is Difficult

Abstract: Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies by example of a data snapshot comprising a quality-filtered subset of 8,736 out of all 16,453 virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer a reliable phylogeny on these data due to the large number of sequences in conjunction with …

Citations: cited by 152 publications (166 citation statements)
References: 54 publications
“…We followed the recommendations of Morel et al. (42), in which 100 separate maximum likelihood phylogenies were generated using RAxML-NG (66) and the GTR+G substitution model, such that each reconstruction used a different random starting parsimony tree. The final phylogeny was then obtained from this set using majority rule.…”
Section: Methods (mentioning)
confidence: 99%
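
The majority-rule step described in this statement can be illustrated with a minimal Python sketch. It is not the RAxML-NG implementation and not the cited authors' pipeline: the nested-tuple tree representation and the function names `leaves`, `splits`, and `majority_rule_splits` are assumptions made for illustration only. The sketch counts how often each non-trivial bipartition occurs across a set of trees over the same taxa and keeps those found in more than half of them.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (hypothetical format)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree, all_taxa, anchor):
    """Collect non-trivial bipartitions, each stored as the side without `anchor`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        # Normalise the split so that equal bipartitions compare equal.
        side = all_taxa - clade if anchor in clade else clade
        # Skip trivial splits (empty side, single leaf, or everything but the anchor).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def majority_rule_splits(trees, threshold=0.5):
    """Map each split found in more than `threshold` of the trees to its frequency."""
    all_taxa = leaves(trees[0])
    anchor = min(all_taxa)  # arbitrary but fixed reference taxon
    counts = Counter()
    for tree in trees:
        counts.update(splits(tree, all_taxa, anchor))
    return {s: c / len(trees) for s, c in counts.items() if c / len(trees) > threshold}


# Three toy trees standing in for independent ML searches (made-up data).
toy_trees = [
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "C"), ("B", ("D", "E"))),
]
support = majority_rule_splits(toy_trees)
for split, freq in sorted(support.items(), key=lambda kv: -kv[1]):
    print(sorted(split), round(freq, 2))
```

In this toy input the split separating D and E from the rest appears in all three trees, while the split separating C, D, and E appears in two of three, so both pass the 50% threshold. With data carrying little phylogenetic signal, few splits would be expected to pass it, which is why a consensus over many searches is more conservative than any single best-scoring tree.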
“…We sought to gain a better understanding of SARS-CoV-2 evolution and to determine whether iSNVs could be used to help resolve phylogenies and transmission clusters. For the 1390 genomes in our study, we constructed a phylogeny using the robust procedure outlined by (42) (Fig. 5A).…”
Section: Within-host Variant Sites Are Phylogenetically Associated (mentioning)
confidence: 99%
“…The lack of statistical support for homoplasy for the vast majority of mutations results from the general lack of phylogenetic signal in SARS-CoV-2 sequences, as has been noted previously (Morel et al, 2020). In addition, several other factors may play an important role.…”
Section: Discussion (mentioning)
confidence: 87%
“…Lack of phylogenetic signal greatly increases the size of the plausible tree set, whilst model misspecification alters the trees included in this set and their relative likelihood values. In this case, the tree with the highest likelihood (the 'best-fitting' tree) is selected, more or less at random, from an immense set of plausible trees with nearly equal support (Morel et al, 2020), which makes it extremely unlikely that this "best" tree represents the true evolutionary history. It is difficult to conclude that some feature of the evolutionary dynamics, such as homoplasy, is supported by the phylogeny if the set of plausible trees includes trees that do and do not support that feature.…”
Section: Introduction (mentioning)
confidence: 99%