2022
DOI: 10.1093/genetics/iyac079

BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data

Abstract: Bioinformatic analysis—such as genome assembly quality assessment, alignment summary statistics, relative synonymous codon usage, file format conversion, and processing and analysis—is integrated into diverse disciplines in the biological sciences. Several command-line pieces of software have been developed to conduct some of these individual analyses, but unified toolkits that conduct all these analyses are lacking. To address this gap, we introduce BioKIT, a versatile command line toolkit that has, upon publ…

Cited by 24 publications (21 citation statements)
References 82 publications
“…To obtain high-quality and adapter-free reads, raw reads were trimmed with Trimmomatic version 0.39 ( 39 ) using the parameters “2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36.” On average, 36 million read pairs passed trimming. Trimmed reads were then assembled with SPAdes version 3.15.2 ( 40 ) using the parameters “--isolate” and “--cov-cutoff auto.” Genome statistics were calculated with BioKIT version 0.0.4 ( 41 ).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…To align the 4,515 single-copy orthologs, MAFFT version 7.402 ( 46 , 47 ) was used along with the parameters “-bl 62 -op 1.0 -maxiterate 1000 -retree 1 -genafpair” ( 41 ). The 4,515 alignments were trimmed with version 1.2.0 of ClipKIT ( 22 ) and then combined into a supermatrix with PhyKIT version 1.5.0 ( 51 ).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Archival instabilities among software threatens the reproducibility of bioinformatics research [ 60 ]. To ensure long-term stability of OrthoSNAP, we implemented previously established rigorous development practices and design principles [ 44 , 52 , 61 , 62 ]. For example, OrthoSNAP features a refactored codebase, which facilitates debugging, testing, and future development.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…The resulting supermatrix had 6,378,237 sites (2,846,432 parsimony informative sites). Alignment length and number of parsimony informative sites were calculated using BioKIT, v0.1.2 (Steenwyk et al, 2022). IQ-TREE 2 was used for tree inference (Minh et al, 2020, 2).…”
Section: Methods
Citation type: mentioning
confidence: 99%
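
For readers unfamiliar with the two alignment statistics named in the statement above, the following is a minimal Python sketch of how alignment length and the number of parsimony-informative sites can be computed. It is not BioKIT's implementation. It assumes the common definition (a site is parsimony-informative when at least two character states each occur in at least two sequences) and assumes nucleotide data, so that `-`, `?`, `N`, and `X` are ignored. The function names and the `IGNORED` set are hypothetical.

```python
from collections import Counter

# Assumption: nucleotide alignment; gaps and ambiguous characters are skipped.
IGNORED = frozenset("-?NX")


def alignment_length(seqs):
    """Return the number of columns, checking that all rows are equally long."""
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return lengths.pop()


def count_parsimony_informative(seqs, ignored=IGNORED):
    """Count columns where at least two states each appear in >= 2 sequences."""
    alignment_length(seqs)  # validates the input
    informative = 0
    for column in zip(*seqs):
        counts = Counter(c.upper() for c in column if c.upper() not in ignored)
        states_seen_twice = sum(1 for n in counts.values() if n >= 2)
        if states_seen_twice >= 2:
            informative += 1
    return informative


if __name__ == "__main__":
    toy = ["ACGTA", "ACGTA", "ATGCA", "ATG-C"]
    print(alignment_length(toy), count_parsimony_informative(toy))
```

In the toy alignment only the second column (C, C, T, T) qualifies; columns with a single variant character, or where a gap leaves one state with fewer than two occurrences, are not counted.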