2022
DOI: 10.1093/gigascience/giac006

Assessing species coverage and assembly quality of rapidly accumulating sequenced genomes

Abstract: Background: Ambitious initiatives to coordinate genome sequencing of Earth's biodiversity mean that the accumulation of genomic data is growing rapidly. In addition to cataloguing biodiversity, these data provide the basis for understanding biological function and evolution. Accurate and complete genome assemblies offer a comprehensive and reliable foundation upon which to advance our understanding of organismal biology at genetic, species, and ecosystem levels. However, ever-changing sequenci…

Cited by 27 publications (15 citation statements)
References: 56 publications
“…Therefore, metazoan‐level universal single‐copy orthologs (metazoan USCOs) have been proposed as a core set of nuclear‐encoded protein‐coding genes for species delimitation in Metazoa (Eberle et al, 2020). USCOs are under strong selection for occurring only in single copy within a genome (Feron & Waterhouse, 2022; Waterhouse et al, 2011, 2013). In Metazoa, 978 USCOs were recognized based on a representative selection of 65 high‐quality genomes (Simão et al, 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…Core information derived from publicly available (INSDC) assemblies and updates to the underlying taxonomy (including new taxon registrations) are updated every weekday, while target and status lists supplied by EBP-affiliate projects are included in the next daily release after they have been pushed to the goat-data GitHub repository. This enables GoaT to be integrated into pipelines, such as the DToL production pipeline which uses the latest available estimated genome sizes to plan sequencing effort for each species, and the Arthropoda Assembly Assessment Catalogue 28 at https://evofunvm.dcsr.unil.ch/upcoming_assemblies.html, that uses information about sequencing status. Frequent updates of status lists, in particular, are fundamental to GoaT achieving its objective of helping to reduce duplication of sequencing efforts as the number and pace of large-scale sequencing projects continue to increase.…”
Section: Discussion (mentioning)
confidence: 99%
“…The inferred size of the ALF genome is 1.3 Gbp with a BUSCO score of 96.08%, suggesting a complete high‐quality assembly (Table 2) (Feron & Waterhouse, 2022; Li et al., 2022; Yan et al., 2021). ALF's genome encodes 21,979 protein‐coding genes, which is in alignment with previous leafhopper genomes (e.g., GWSS has 19,904 genes, and TGLH has 19,642 genes; Li et al., 2022; Zhao et al., 2022).…”
Section: Discussion (mentioning)
confidence: 99%