2023
DOI: 10.12688/wellcomeopenres.18658.1

Genomes on a Tree (GoaT): A versatile, scalable search engine for genomic and sequencing project metadata across the eukaryotic tree of life

Abstract: As genomic data transform our understanding of biodiversity, the Earth BioGenome Project (EBP) has set a goal of generating reference quality genome assemblies for all ~1.9 million described eukaryotic taxa. Meeting this goal requires coordination among many individual regional and taxon-focussed projects working under the EBP umbrella. Large-scale sequencing projects require ready access to validated genome-relevant metadata, such as genome sizes and karyotypes, but these data are dispersed across the literat…

Cited by 194 publications (14 citation statements)
References: 20 publications
“…Following contaminant screening, 63,023 reads (0.46%) were removed (Table S1). The k-mer based genome size estimation was 2.31 Gb from 109.96 Gb of read data (47.6x), double the 1.1 Gb GoaT estimation (Challis et al. 2023) (Table S1).…”
Section: Results | Citation type: mentioning
confidence: 99%
“…The goal of the EBP is to create a global network of biodiversity genomics researchers that share a mission to produce a database of openly accessible, standardised, and complete reference resources that span the whole eukaryotic phylogenetic tree. The project has a three-phase approach and to date (Phase I) has produced ~1,213 reference genomes for species across ~1,010 genera 16. However the rate of production is fast increasing, for instance in 2022 over 316 reference genomes were produced and in the coming years the rate is estimated to increase by at least 10 fold.…”
Section: The State Of Reference Genome Production Today | Citation type: mentioning
confidence: 99%
“…We developed the current portal rapidly to support the goals of the pilot test, but it will be continually and iteratively improved to enhance usability, for example by potentially adding species imagery and distribution ranges, Ensembl 38 and community annotations, interactive geographic map searches, and cross referencing to key resources such as the Global Biodiversity Information Facility (https://www.gbif.org/) and climate data. Progress data is continuously shared through the portal’s public tracking pages (https://portal.erga-biodiversity.eu/status_tracking) and the GoAT database 16 (https://goat.genomehubs.org/projects/ERGA-PIL).…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…These projects (or even the nodes within a project) are operationally independent and carry out all of the steps of the genome sequencing process autonomously: from the collection of the biological samples to the production of the genome assembly. The EBP uses Genomes on a Tree (GoaT, (7)) for coordination. GoaT is a centralized resource sponsored by Tree of Life programme (https://www.sanger.ac.uk/programme/tree-of-life/) that collates observed and estimated genome-relevant metadata—including genome sizes and karyotypes—for eukaryotic species, and that also holds declarations of current and planned activity across the EBP nodes.…”
Section: Introduction | Citation type: mentioning
confidence: 99%