2023
DOI: 10.1101/2023.03.17.533215
Preprint

A machine-readable specification for genomics assays

Abstract: Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries. We present seqspec, a machine-readable specification for libraries produced by genomics assays that facilitates standardization of preprocessing and enables tracking and comparison of genomics assays. The spe…

Cited by 8 publications (13 citation statements)
References 18 publications
“…Elements such as barcodes, unique molecular identifiers (UMIs), and genomic features such as cDNA must be accurately identified and appropriately parsed by preprocessing tools to ensure that cataloging, error correcting, and counting are correctly performed. To enable proper element identification and universal preprocessing, we use the seqspec index command (Booeshaghi, Chen, and Pachter 2023) to extract sequenced elements in a tool-specific manner and perform read cataloging against the previously-indexed categories. This indexing strategy correctly identifies and extracts relevant single-cell features in all assay types, a result of the consistent and controlled vocabulary of the seqspec specification.…”
Section: Results (citation type: mentioning)
confidence: 99%
“…The physical-isolation and molecular-capture methods used for an assay are reflected in the structure of reads produced by these methods (Figure 1). The seqspec assay specification (Booeshaghi, Chen, and Pachter 2023) provides a machine-readable format for describing this structure, thereby translating the organizational themes of single-cell genomics into a medium that can, in principle, facilitate automatic and universal preprocessing of single-cell genomics data. While several tools have been developed for preprocessing data from different single-cell RNA-seq assays (He et al 2022; Battenberg et al 2022; Melsted et al 2021), we demonstrate that seqspec (Booeshaghi, Chen, and Pachter 2023), along with kallisto bustools (Melsted et al 2021), kITE (Sina Booeshaghi et al 2022) and snATAK (Sina Booeshaghi, Gao, and Pachter 2023), can in principle be used for preprocessing data from any single-cell genomics assay.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The reference consisted of the Homo sapiens genome sequence GRCh38.p13 retrieved together with the used gene models from Ensembl 108, extended with the Rhapsody sample tags for Homo sapiens. Cell barcodes were retrieved from [45,46] …”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Cell barcodes were retrieved from [45,46]. The targeted gene set consisted of the Becton Dickinson Rhapsody Onco-BC Targeted Panel (https://scomix.bd.com/hc/article_attachments/13766899704717) and a specifically selected panel to detect TNF+IL-17A regulated genes in mesothelial cells based on our data described below in the Results section.…”
Section: Single-cell RNA Sequencing (scRNA-seq) of Patient-derived Me... (citation type: mentioning)
confidence: 99%
“…To date, the community has put significant efforts into documenting and categorizing the library layout for many existing sequencing assays [19, 10, 20] and developing general parsers for such protocols [21, 22, 10, 23]. Of course, these tools, or their relevant components, could also be applied to this task, with the user handling the appropriate bookkeeping.…”
Section: Introduction (citation type: mentioning)
confidence: 99%