2018
DOI: 10.1038/s41467-018-05083-x

Detection and removal of barcode swapping in single-cell RNA-seq data

Abstract: Barcode swapping results in the mislabelling of sequencing reads between multiplexed samples on patterned flow-cell Illumina sequencing machines. This may compromise the validity of numerous genomic assays; however, the severity and consequences of barcode swapping remain poorly understood. We have used two statistical approaches to robustly quantify the fraction of swapped reads in two plate-based single-cell RNA-sequencing datasets. We found that approximately 2.5% of reads were mislabelled between samples o…

Citations: cited by 260 publications (214 citation statements)
References: 21 publications
“…Barcode swapping and chimera formation-Swapping of droplet barcode between transcripts during mixed-template PCR amplification via formation of heteroduplex/chimeric molecules [5][6][7], and/or on the flowcell during sequencing [11], will spread transcripts across droplets and generate a background.…”
Section: Exhibit
Citation type: mentioning
confidence: 99%
“…For droplet-based scRNA-seq technologies such as 10X Genomics [35], the DropletUtils package can be used to perform key quality control tasks such as distinguishing empty droplets from cells [54] and reducing the effects of barcode swapping [55]. The scater [56] package automates the calculation of a number of key quality control metrics.…”
Section: Cell and Gene Quality Control
Citation type: mentioning
confidence: 99%
“…Despite recent attempts to computationally estimate the rate of sample index hopping in plate-based scRNA-seq data (Larsson et al, 2018; Griffiths et al, 2018), no statistical model of index hopping for droplet-based scRNA-seq data has yet been proposed. Consequently, current computational methods can neither accurately estimate the underlying rate of index hopping nor adequately remove the resulting phantom molecules in droplet-based scRNA-seq data.…”
Section: Précis
Citation type: mentioning
confidence: 99%
“…The generative probabilistic model we propose starts with the observation that each cDNA fragment, in addition to its sample barcode index, has a cell barcode and a unique molecular identifier (UMI), and maps to a specific gene. As has been suggested previously (Griffiths et al, 2018), we make the assumption that any particular cell-UMI-gene combination (hereafter referred to as CUG) is so unlikely that it cannot arise independently in any two different samples (Section 1). Accordingly, each CUG would represent one unique molecule and all sequencing reads with the same combination would correspond to PCR amplification…”
Section: Précis
Citation type: mentioning
confidence: 99%
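
The CUG assumption quoted above can be illustrated with a small sketch. It is not the implementation from either paper: the function name `resolve_shared_molecules`, the input layout, and the `min_fraction` threshold of 0.8 are all assumptions chosen for illustration. The idea is that a cell-UMI-gene combination seen in more than one sample is treated as a single molecule, which is kept in the sample holding a clear majority of its reads and discarded otherwise.

```python
from collections import defaultdict


def resolve_shared_molecules(read_counts, min_fraction=0.8):
    """Assign each cell-UMI-gene (CUG) combination to at most one sample.

    read_counts maps (sample, cell_barcode, umi, gene) -> number of reads.
    min_fraction is a hypothetical threshold, not a value taken from the text.
    """
    # Collect the per-sample read counts for every CUG combination.
    by_cug = defaultdict(dict)
    for (sample, cell, umi, gene), n_reads in read_counts.items():
        by_cug[(cell, umi, gene)][sample] = n_reads

    kept = {}          # CUG -> sample that the molecule is assigned to
    n_discarded = 0    # CUGs with no sufficiently dominant sample
    for cug, per_sample in by_cug.items():
        total = sum(per_sample.values())
        top_sample = max(per_sample, key=per_sample.get)
        if per_sample[top_sample] / total >= min_fraction:
            kept[cug] = top_sample
        else:
            n_discarded += 1
    return kept, n_discarded


# Toy input: sample names, barcodes, UMIs and genes are made up.
reads = {
    ("S1", "AAAC", "TTG", "Actb"): 9,
    ("S2", "AAAC", "TTG", "Actb"): 1,   # likely swapped from S1
    ("S1", "CCGT", "GGA", "Gapdh"): 3,
    ("S2", "CCGT", "GGA", "Gapdh"): 3,  # ambiguous origin
    ("S2", "TTAG", "ACC", "Sox2"): 5,   # seen in one sample only
}
kept, n_discarded = resolve_shared_molecules(reads)
```

In this toy input the first molecule stays with S1 (9 of 10 reads), the second is discarded because neither sample dominates, and the third is untouched because it appears in only one sample.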