2023
DOI: 10.1186/s12859-023-05527-2

EasyCGTree: a pipeline for prokaryotic phylogenomic analysis based on core gene sets

Dao-Feng Zhang, Wei He, Zongze Shao et al.

Abstract: Background: Genome-scale phylogenetic analysis based on core gene sets is routinely used in microbiological research. However, the techniques are still not approachable for individuals with little bioinformatics experience. Here, we present EasyCGTree, a user-friendly and cross-platform pipeline to reconstruct genome-scale maximum-likelihood (ML) phylogenetic trees using supermatrix (SM) and supertree (ST) approaches. Results: EasyCGTree was implemented…

Cited by 23 publications (9 citation statements)
References 37 publications
“…In order to assess the phylotaxonomy of the family ‘Rhodobacteraceae’ and assign the 12 isolates into reasonable taxonomic positions, the program hmmsearch from the hmmer software package (http://hmmer.org/) was used for homologous gene calling of the four gene sets mentioned above, Clustal Omega [38] for multiple sequence alignments, trimAl [39] for trimming and conserved region selection, FastTree [40] and iq-tree [41] for phylogeny inference, and consense from the Phylip software package [42] was used to summarize a CS tree from gene trees. This workflow was implemented automatically using the pipeline EasyCGTree (https://github.com/zdf1987/EasyCGTree4) [37]. In this study, 16 different combinations of the four gene sets (rp1, rp2, bac120 and rhodo268), two tree-making approaches (SM and CS) and two phylogeny inference tools (FastTree and iq-tree) were used to perform phylogenomic analysis.…”
Section: Methods (mentioning)
confidence: 99%
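
The count of 16 analyses quoted above is the product of four gene sets, two tree-making approaches and two inference tools (4 × 2 × 2). The sketch below only enumerates that design; the function name `enumerate_runs` and the dictionary keys are hypothetical, and nothing here invokes EasyCGTree or reflects how the citing authors scripted their runs.

```python
from itertools import product

# Labels taken from the citation statement; the data layout is an assumption.
GENE_SETS = ["rp1", "rp2", "bac120", "rhodo268"]
APPROACHES = ["SM", "CS"]            # supermatrix and consensus tree
TOOLS = ["FastTree", "iq-tree"]      # phylogeny inference tools


def enumerate_runs(gene_sets, approaches, tools):
    """Return one record per (gene set, approach, tool) combination."""
    return [
        {"gene_set": g, "approach": a, "tool": t}
        for g, a, t in product(gene_sets, approaches, tools)
    ]


runs = enumerate_runs(GENE_SETS, APPROACHES, TOOLS)
print(len(runs))  # number of planned analyses
for run in runs:
    print(run["gene_set"], run["approach"], run["tool"])
```

Listing the design explicitly makes it easy to check that no combination was skipped before the trees are compared.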
“…Subsequently, amino acid sequences of each core gene cluster were gathered up into a single file from the proteomes of related genomes. Finally, the Perl (http://hmmer.org/) script ‘BuildHMM.pl’ in the EasyCGTree package (https://github.com/zdf1987/EasyCGTree4) [37] was used to generate profile HMM files for the rhodo268 gene set.…”
Section: Methods (mentioning)
confidence: 99%
“…Genome-based phylogeny of the supermatrix approach from protein sequences of the bac120 gene set (120 single-copy genes prevalent in Bacteria) was reconstructed by using EasyCGTree version 4.1 (https://github.com/zdf1987/EasyCGTree4) [3, 33]. The bac120 tree indicated that strains WL0004 T and XHP0148 T clustered together with R. marina CGMCC 1.9108 T and were close to the cluster formed by R. alba 1NDH52C T and R. pomeroyi DSS-3 T.…”
Section: Genome Features (mentioning)
confidence: 99%
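
As a rough illustration of the supermatrix idea mentioned in this statement, the sketch below concatenates per-gene alignments into one long alignment per genome and pads genomes that lack a gene with gap characters. It is a generic sketch under stated assumptions, not EasyCGTree's own code (the pipeline is distributed as Perl scripts); `build_supermatrix`, the taxon names and the toy sequences are all hypothetical.

```python
def build_supermatrix(gene_alignments, taxa, gap="-"):
    """Concatenate aligned genes per taxon, gap-filling missing genes.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence,
                     where all sequences within one dict share a length.
    taxa:            ordered list of taxon names to include.
    """
    parts = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon without this gene contributes a run of gap characters.
            parts[taxon].append(alignment.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input: two already-aligned genes over three hypothetical strains.
gene_a = {"strainA": "MK-L", "strainB": "MKAL"}
gene_b = {"strainA": "GG", "strainC": "GA"}
matrix = build_supermatrix([gene_a, gene_b], ["strainA", "strainB", "strainC"])
for taxon, seq in matrix.items():
    print(taxon, seq)
```

Every row of the result has the same length, which is what a tree-inference tool expects from a concatenated alignment.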
“…As for the phylogenetic analysis based on the bac120 gene set (Fig. 3), 64 genomes (45 from type strains, one outgroup, and 18 from type strains) were used as an input of the pipeline EasyCGTree [39], and two isolates that shared 16S rRNA gene identities of >95 % against strain WL0086 T were excluded because of a low level of completeness. The genome of Opitutaceae bacterium AH-707-J18 (GCA_028257465.1) was derived from single-cell amplified genome sequencing with a completeness of 37.64 %, while that of uncultured bacterium FRAM_WSC20_S10_SRF_bin_82 was derived from metagenome sequencing (GCA_949373025.1) with a completeness of 64.19 %.…”
Section: Genome Features (mentioning)
confidence: 99%