2018
DOI: 10.1093/nar/gky421

CellAtlasSearch: a scalable search engine for single cells

Abstract: Owing to the advent of high-throughput single-cell transcriptomics, the past few years have seen exponential growth in the production of gene expression data. Recently, efforts have been made by various research groups to homogenize and store single-cell expression data from a large number of studies. The true value of this ever-increasing data deluge can be unlocked by making it searchable. To this end, we propose CellAtlasSearch, a novel search architecture for high-dimensional expression data, which is massively parallel…

Citations: cited by 40 publications (38 citation statements)
References: 25 publications
“…Lastly, 4 random seeds were tested for each cutoff and each method to reflect stability. Several other cell querying tools (CellAtlasSearch [3], scQuery [33], scMCA [34]) were not included in our benchmark because they do not support custom reference datasets.…”
Section: Methods (mentioning)
confidence: 99%
“…Analogous to biological sequence analysis [1], identifying expression similarity to well-curated references via a cell querying algorithm is becoming the first step of annotating newly sequenced cells. Tools have been developed to identify similar cells using approximate cosine distance [2] or LSH Hamming distance [3, 4] calculated from a subset of carefully selected genes. Such intuitive approach is efficient, especially for large-scale data, but may suffer from non-biological variation across datasets, i.e.…”
Section: Main Text (mentioning)
confidence: 99%
“…There also exist tools that perform single cell similarity search on reference datasets, such as CellAtlasSearch and scMatch [22,42]. Rather than using marker genes, these methods compare the entire gene expression profile of every single cell to a reference database, using locality-sensitive hashing in the case of CellAtlasSearch [22] or Pearson or Spearman correlation in the case of scMatch [42]. These tools do not include functionality for clustering or low-dimensional visualization.…”
Section: Discussion (mentioning)
confidence: 99%
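
To make the correlation-based comparison in the statement above concrete, here is a minimal sketch in plain Python. It is not scMatch's code: the function names, the toy reference profiles, and the gene ordering are all hypothetical, and the data are invented purely for illustration. It scores a query expression profile against each reference profile with Pearson or Spearman correlation and reports the best-scoring label.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def best_match(query, reference, similarity=pearson):
    """Label of the reference profile most similar to the query."""
    return max(reference, key=lambda label: similarity(query, reference[label]))

# Hypothetical toy data: five genes, three reference cell types.
reference = {
    "T cell": [9.0, 8.0, 0.5, 0.2, 1.0],
    "B cell": [0.3, 1.0, 9.5, 8.0, 0.8],
    "monocyte": [0.5, 0.2, 1.0, 0.7, 9.0],
}
query = [7.5, 6.0, 1.0, 0.0, 2.0]

print(best_match(query, reference, similarity=pearson))
print(best_match(query, reference, similarity=spearman))
```

This exhaustive scan costs time proportional to the number of reference cells multiplied by the number of genes, which is the cost that hashing-based search tries to avoid.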
“…However, the scalability of scmap-cell is limited and is not applicable to extremely large data sets. Srivastava et al [15] have also developed a web service named CellAtlasSearch that searches existing scRNA-seq experiments using locality-sensitive hashing (LSH) and graphical processing units (GPUs) to accelerate the search. In LSH, expression profiles are hashed into bit vectors, and their similarities are estimated from the Hamming distance between bit vectors calculated by LSH [16].…”
Section: Introduction (mentioning)
confidence: 99%
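
The hashing idea described in the statement above can be sketched as follows. This is an illustration only, not CellAtlasSearch's implementation: the statement does not say which hash family is used, so the sketch assumes sign random projection (one random hyperplane per bit), and every function name and data value here is hypothetical. Each profile becomes an integer bit code, and similarity is estimated from the Hamming distance between codes.

```python
import random

def make_hyperplanes(n_bits, n_genes, seed=0):
    """Draw one random Gaussian hyperplane per output bit (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_bits)]

def hash_profile(profile, hyperplanes):
    """Pack the sign of each projection into an integer bit code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, profile))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def hamming(a, b):
    """Number of bit positions at which two codes differ."""
    return bin(a ^ b).count("1")

def nearest(query_code, reference_codes):
    """Label of the reference code with the smallest Hamming distance."""
    return min(reference_codes, key=lambda label: hamming(query_code, reference_codes[label]))

# Hypothetical toy data: five genes, three reference cell types.
reference = {
    "T cell": [9.0, 8.0, 0.5, 0.2, 1.0],
    "B cell": [0.3, 1.0, 9.5, 8.0, 0.8],
    "monocyte": [0.5, 0.2, 1.0, 0.7, 9.0],
}
planes = make_hyperplanes(n_bits=64, n_genes=5)
reference_codes = {label: hash_profile(p, planes) for label, p in reference.items()}

query = [7.5, 6.0, 1.0, 0.0, 2.0]
query_code = hash_profile(query, planes)
for label, code in reference_codes.items():
    print(label, hamming(query_code, code))
print("nearest:", nearest(query_code, reference_codes))
```

Under this assumed hash family, the expected fraction of differing bits grows with the angle between two profiles, so the Hamming distance acts as a cheap proxy for cosine distance. Comparing codes needs only an XOR and a bit count per pair, and each pair is independent of the others, which makes the comparison straightforward to run in parallel.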