2022
DOI: 10.1101/2022.05.18.492548
Preprint

Metadata retrieval from sequence databases with ffq

Abstract: We present a command-line tool, called ffq, for querying metadata from genomic databases.

Cited by 18 publications (13 citation statements)
References 30 publications
“…Our open-source Python and command-line program gget enables efficient and easy programmatic access to information stored in a diverse collection of large, public genomic reference databases. gget works alongside existing tools that fetch user-generated sequencing data (Gálvez-Merchán et al, 2022) to replace ineffective, error-prone manual web access during genomic data analysis. While the gget modules were motivated by experience with tedious single-cell RNA-seq data analysis tasks (Supplementary Figure 1), we anticipate their utility for a wide range of bioinformatics tasks.…”
Section: Discussion (mentioning)
confidence: 99%
“…For RNA-seq data, we downloaded metaSRA [73] version 1.8 to identify samples associated with potential age and sex information. We then used ffq [74] to fetch sample accession data from the Sequence Read Archive (SRA) [41] to match the sample identifiers used in metaSRA to the run identifiers used in refine.bio. We manually checked these labels as well by reading sample descriptions obtained from SRA.…”
Section: Curation of Age and Sex Labels (mentioning)
confidence: 99%
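The identifier matching described in the statement above can be sketched roughly as follows. This is an illustration only, not the citing authors' code. It assumes that ffq is installed and on PATH, that it prints a JSON object keyed by run accession to stdout, and that each run record carries a "sample" field holding the sample accession. The helper names `fetch_run_metadata` and `sample_to_runs` are hypothetical.

```python
import json
import subprocess
from collections import defaultdict


def fetch_run_metadata(run_accessions):
    """Call the ffq command-line tool and parse its JSON output.

    Assumption: ffq is on PATH and writes a JSON object keyed by
    accession to stdout. Requires network access when actually run.
    """
    result = subprocess.run(
        ["ffq", *run_accessions],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def sample_to_runs(ffq_records):
    """Group run accessions by the sample accession they belong to.

    Assumption: each record is a dict with an optional "sample" key.
    Records without that key are skipped.
    """
    mapping = defaultdict(list)
    for run_id, record in ffq_records.items():
        sample_id = record.get("sample")
        if sample_id is not None:
            mapping[sample_id].append(run_id)
    # Sort run lists so the result does not depend on input order.
    return {sample: sorted(runs) for sample, runs in mapping.items()}
```

With a mapping like this, labels keyed by sample identifier could be joined to data keyed by run identifier; the real ffq output should be inspected first, since its exact fields may differ from what is assumed here.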
“…• anndata 0.7.6 [24] • bustools 0.40.0 [10,25] • IGV 2.13.0 [13] • kallisto 0.48.0 [11] • kb-python 0.27.2 [10] • ffq 0.2.1 [26] • gget 0.1.1 [27] • HISAT2 2.2.1 [12] • htslib 1.10 [28]…”
Section: Software (unclassified)