“…To assess PhyloAln’s capability in mapping sequences/reads into reference alignments across diverse species spanning the tree of life and its performance on contaminated data, we constructed a simulated dataset. This dataset comprises genomes from one bacterium (Escherichia coli (Hayashi, et al 2001)), one plant (Arabidopsis thaliana (Cheng, et al 2017)), one fungus (Saccharomyces cerevisiae (Engel, et al 2022)), two vertebrates (Danio rerio (Howe, et al 2013) and Homo sapiens (Collins, et al 2004)), eight fruit flies (Scaptodrosophila lebanonensis (Flynn, et al 2020), Drosophila melanogaster (Hoskins, et al 2015), D. simulans (Wang, et al 2023), D. willistoni (Ranz, et al 2023), D. mojavensis (Kim, et al 2021), D. yakuba (Huang, et al 2022), D. busckii (Renschler, et al 2019), D. pseudoobscura (Barata, et al 2023)) and two other insects (Tribolium castaneum (Herndon, et al 2020) and Aedes aegypti (Matthews, et al 2018)), sourced from the NCBI RefSeq database (Haft, et al 2023) (Table S1). The phylogeny of these species was obtained from the TimeTree database (Kumar, et al 2022).…”