2022
DOI: 10.1093/molbev/msac092

AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era

Abstract: Sequence simulators play an important role in phylogenetics. Simulated data has many applications, such as evaluating the performance of different methods, hypothesis testing with parametric bootstraps, and, more recently, generating data for training machine-learning applications. Many sequence simulation programs exist, but the most feature-rich programs tend to be rather slow, and the fastest programs tend to be feature-poor. Here, we introduce AliSim, a new tool that can efficiently simulate biologically r…

Citations: cited by 53 publications (67 citation statements)
References: 39 publications
“…Different R, H, and F were simulated over the trees in the first simulation experiment, while the same R, H, and F were shared among the trees in the second simulation experiment. The alignments were then simulated according to the tree, the GTR model, and the gamma rate using AliSim (38). Each simulated dataset contained 100k bases, regardless of the number of trees m. The proportions of the lengths of each of the m alignments simulated from each of the m trees were the ratios of m random integers drawn from a uniform distribution between 1 and 10.…”
Section: Model Description the Mast Model Consists Of M Classes
Citation type: mentioning
confidence: 99%
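The length-allocation step described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function name `partition_lengths`, the inclusive integer range, and the largest-remainder rounding used to reach exactly 100,000 sites are all assumptions.

```python
import random


def partition_lengths(m, total=100_000, low=1, high=10, rng=None):
    """Split `total` sites among m trees in proportion to m random integers.

    Assumptions (not stated in the source): integers are drawn inclusively
    from [low, high], and rounding uses the largest-remainder method.
    """
    rng = rng or random.Random()
    weights = [rng.randint(low, high) for _ in range(m)]
    weight_sum = sum(weights)

    # Floor of each proportional share.
    lengths = [total * w // weight_sum for w in weights]

    # Hand out the sites lost to flooring, largest fractional part first.
    leftover = total - sum(lengths)
    order = sorted(
        range(m),
        key=lambda i: (total * weights[i]) % weight_sum,
        reverse=True,
    )
    for i in order[:leftover]:
        lengths[i] += 1
    return weights, lengths


if __name__ == "__main__":
    weights, lengths = partition_lengths(4, rng=random.Random(0))
    print(weights, lengths, sum(lengths))
```

Each resulting length would then be the alignment length simulated under the corresponding tree, so that the concatenated dataset always has the same total size regardless of m.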
“…To investigate if certain regions in the M. bovis genome have significantly higher values of either SNP density or selective sweep sites, a probabilistic hypothesis test was used to find highly significant genomic regions. For that, we produced a simulated alignment from the original alignment with the tool Alisim using the same alignment length, nucleotide substitution model, and phylogenetic tree topology (created from the previously computed Maximum Likelihood tree; Ly-Trong et al., 2022). Alisim next created a random sequence based on the previous specifications and then simulated nucleotide substitutions that were independently added while also conforming to the phylogenetic topology, resulting in a simulated alignment output.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…These methods take distance matrices, which are usually derived from sequence alignments, as input and produce a phylogenetic tree as output. Our simulations are implemented mostly in R, using the packages phangorn (Schliep, 2011) and ape (Paradis and Schliep, 2019), and for simulating alignments we use the tool AliSim from the software package IQ-TREE (Ly-Trong et al., 2021). All implementations can be found in (Collienne, 2022).…”
Section: Supplementary Materials
Citation type: mentioning
confidence: 99%
“…For simulating sequence alignments we use the tool AliSim from the software package IQ-TREE (Ly-Trong et al., 2021). Sequences of length 1,000 are computed for a random Yule-Harding tree on n leaves, assuming an HKY model (Hasegawa et al., 1985) with transition/transversion ratio of 2 and base frequencies 0.1, 0.2, 0.3, 0.4 for A, C, G, and T, respectively.…”
Section: Supplementary Materials
Citation type: mentioning
confidence: 99%
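For readers unfamiliar with the HKY parameterisation quoted above, the rate matrix it implies can be sketched in a few lines. This is an illustrative sketch only: it assumes the quoted "transition/transversion ratio of 2" refers to the rate-ratio parameter kappa, and the function name `hky_rate_matrix` is hypothetical rather than part of AliSim or IQ-TREE.

```python
BASES = "ACGT"
# Transitions are purine<->purine (A<->G) and pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky_rate_matrix(kappa, freqs):
    """Return the HKY rate matrix Q, scaled to one expected substitution
    per site per unit time. `freqs` are base frequencies in A, C, G, T order.
    """
    n = len(BASES)
    q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                # Rate into base y: its frequency, times kappa for transitions.
                q[i][j] = freqs[j] * (kappa if (x, y) in TRANSITIONS else 1.0)
        # Diagonal makes each row sum to zero (q[i][i] is still 0.0 here).
        q[i][i] = -sum(q[i])

    # Normalise so that the mean rate -sum_i pi_i * q_ii equals 1.
    scale = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[v / scale for v in row] for row in q]


if __name__ == "__main__":
    for row in hky_rate_matrix(2.0, [0.1, 0.2, 0.3, 0.4]):
        print(["%.4f" % v for v in row])
```

If the quoted ratio instead denotes the expected transition/transversion ratio rather than kappa, the value passed as `kappa` would need to be converted accordingly.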