2021
DOI: 10.1093/molbev/msab199

BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes

Abstract: Methods for evaluating the quality of genomic and metagenomic data are essential to aid genome assembly and to correctly interpret the results of subsequent analyses. BUSCO estimates the completeness and redundancy of processed genomic data based on universal single-copy orthologs. Here we present new functionalities and major improvements of the BUSCO software, as well as the renewal and expansion of the underlying datasets in sync with the OrthoDB v10 release. Among the major novelties, BUSCO now enables phy…

Citations: cited by 3,230 publications (2,028 citation statements)
References: 25 publications
“…A total of 26,230 genes were functionally annotated. We evaluated the completeness of the predicted gene sets and extent of gene duplication with 4,896 BUSCOs from the Poales database (v10; Manni et al 2021), of which 3,930 (80.2%) were complete, indicating a relatively complete genome assembly and gene prediction (table 1). An interesting observation among the complete BUSCO’s was the presence of 1,159 (30%) complete duplicated copies.…”
Section: Results (mentioning)
confidence: 99%
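
The percentages quoted in the statement above can be re-derived from its raw counts. The following is a minimal sketch, not the citing authors' code; the function name `percent` is hypothetical, and the assumption that the 30% duplicated figure is taken relative to the complete BUSCOs (rather than all 4,896) is an interpretation of the quoted wording.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the quoted citation statement.
total_buscos = 4896       # BUSCOs in the Poales dataset
complete = 3930           # complete BUSCOs
duplicated = 1159         # complete and duplicated BUSCOs

complete_pct = percent(complete, total_buscos)
# Assumption: the quoted 30% is relative to the complete BUSCOs.
duplicated_of_complete_pct = percent(duplicated, complete)
# For comparison: duplicated as a share of all BUSCOs searched.
duplicated_of_total_pct = percent(duplicated, total_buscos)

print(f"complete: {complete_pct:.2f}%")
print(f"duplicated / complete: {duplicated_of_complete_pct:.2f}%")
print(f"duplicated / total: {duplicated_of_total_pct:.2f}%")
```

Under these assumptions the complete fraction comes out at about 80.27%, which is consistent with the quoted 80.2% if that value was truncated rather than rounded, and the duplicated share of complete BUSCOs is about 29.5%, in line with the quoted 30%.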
“…While congruence is high especially for high-scoring assemblies, truly objective comparisons require reporting of the BUSCO versions, parameters, and lineage datasets used. Our data will enable future large-scale comparisons with results from the recently released BUSCO v5, which includes a new genome assessment strategy that improves efficiency and runtimes (Manni et al 2021). The automated analysis workflow to build and maintain NCBI genome assembly assessment catalogues for selected taxa allows users to build updatable community resources, here exemplified with the A³Cat that facilitates surveying of species coverage and data quality for available arthropod assemblies and serves to guide ongoing and future genome generation initiatives.…”
Section: Discussion (mentioning)
confidence: 99%
“…Finally, SSPACE V2.0 (Boetzer and Pirovano, 2014) (-k 5 -T 25 -g 2) was used to further extend and fill up both contigs and scaffolds. Completeness assessment of the final genome assembly was performed by BUSCO v5.2.2 (Simão et al, 2015; Manni et al, 2021) (e-value ≤1e-3) with the popular actinopterygii_odb10 database.…”
Section: Genome Survey De Novo Assembly and Assessment (mentioning)
confidence: 99%
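
A run like the one described in the statement above might be assembled as follows. This is a hedged sketch, not the citing authors' actual invocation: the file name `assembly.fasta`, the output name `busco_out`, and the helper `build_busco_command` are hypothetical, and the flags (`-i`, `-l`, `-m`, `-o`, `-e`) reflect the BUSCO v5 command-line help as I understand it, so they are worth checking against `busco --help` for the installed version. The sketch only builds the argument list and does not execute anything.

```python
from typing import List


def build_busco_command(
    input_fasta: str,
    lineage: str,
    mode: str = "genome",
    out_name: str = "busco_out",
    evalue: str = "1e-03",
) -> List[str]:
    """Build an argument list for a BUSCO v5 run (illustrative only)."""
    allowed_modes = {"genome", "transcriptome", "proteins"}
    if mode not in allowed_modes:
        raise ValueError(f"mode must be one of {sorted(allowed_modes)}")
    return [
        "busco",
        "-i", input_fasta,   # input sequence file (hypothetical path)
        "-l", lineage,       # lineage dataset named in the quoted statement
        "-m", mode,          # assessment mode
        "-o", out_name,      # name of the output run (hypothetical)
        "-e", evalue,        # e-value cutoff, matching the quoted 1e-3
    ]


cmd = build_busco_command("assembly.fasta", "actinopterygii_odb10")
print(" ".join(cmd))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where BUSCO is installed. Keeping the arguments as a list rather than a single shell string avoids quoting problems with unusual file names.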
“…This unpublished genome of S. anophthalmus was sequenced by us on both Illumina Hiseq2500 and PacBio Sequel platforms using muscle genomic DNAs, and the final assembly of 1.9 Gb (with a contig N50 of 229.8 kb, a scaffold N50 of 309.9 kb, and prediction of 49,865 protein-coding genes) was assembled by combining the corrected long PacBio reads and the primary assembly from short Illumina reads by DBG2OLC v1.1 (Ye et al, 2016). We also assessed the completeness of these protein-coding gene sets by BUSCO v5.2.2 (Manni et al, 2021).…”
Section: Gene Structure and Function Annotations (mentioning)
confidence: 99%