2021
DOI: 10.1101/2021.10.17.464750
Preprint

SIMBA: SIngle-cell eMBedding Along with features

Abstract: Recent advances in single-cell omics technologies enable the individual or joint profiling of cellular measurements including gene expression, epigenetic features, chromatin structure and DNA sequences. Currently, most single-cell analysis pipelines are cluster-centric, i.e., they first cluster cells into non-overlapping cellular states and then extract their defining genomic features. These approaches assume that discrete clusters correspond to biologically relevant subpopulations and do not explicitly model …

Cited by 13 publications (14 citation statements)
References 59 publications (122 reference statements)
“…As datasets continue to grow in size and complexity, it becomes increasingly important to quantitatively visualize interactions between entities. Future datasets may require multi-localization, where higher-order interactions (e.g., between a ligand and multimeric receptor (22); antibodies, antigens, and cell receptors (52); or single-cell multi-omics datasets (53)) are embedded in a low-dimensional space.…”
Section: Discussion (mentioning)
confidence: 99%
“…SIMBA (1.1) (https://github.com/pinellolab/simba) was used to embed individual genes alongside cells in feature space, enabling the characterization of cell type-gene specificity [82]. Cellranger output files filtered_feature_bc_matrix.h5 for each condition were input and consequently subset to only the cells that remained post filtering and were used for downstream analysis (see PBMC Secondary Analysis).…”
Section: Orthogonal Validation (mentioning)
confidence: 99%
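
The subsetting step described in the statement above (keeping only the cells that survived filtering) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the citing authors' code: the function name `subset_to_kept_cells`, the barcodes, and the toy counts are all hypothetical, and reading `filtered_feature_bc_matrix.h5` itself is left out.

```python
from typing import List, Sequence, Tuple


def subset_to_kept_cells(
    barcodes: Sequence[str],
    matrix_rows: Sequence[Sequence[int]],
    kept_barcodes: Sequence[str],
) -> Tuple[List[str], List[List[int]]]:
    """Keep only the rows (cells) whose barcode is in kept_barcodes.

    Hypothetical helper: rows of matrix_rows are assumed to be cells,
    in the same order as barcodes. The original row order is preserved.
    """
    kept = set(kept_barcodes)
    keep_idx = [i for i, bc in enumerate(barcodes) if bc in kept]
    sub_barcodes = [barcodes[i] for i in keep_idx]
    sub_rows = [list(matrix_rows[i]) for i in keep_idx]
    return sub_barcodes, sub_rows


# Toy data (made up): three cells, two genes.
all_barcodes = ["AAAC-1", "AAAG-1", "AAAT-1"]
all_counts = [[5, 0], [0, 0], [2, 7]]
passed_filtering = ["AAAT-1", "AAAC-1"]  # barcodes retained after QC

sub_barcodes, sub_counts = subset_to_kept_cells(
    all_barcodes, all_counts, passed_filtering
)
```

Matching on barcodes rather than on row positions keeps the subset correct even when the list of retained cells is in a different order than the input matrix.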
“…However, scBasset requires training of a large neural network model where the number of tasks equals the number of cells and likely will require further optimizations to scale to large datasets. Finally, a recent method called SIMBA uses a graph embedding approach for scRNA-seq, scATAC-seq, and multiome data [6], where cells, genes, peaks, k-mers, and TF motifs are vertices in the graph for a data set, and edges connect entities (like peaks) that relate to other entities (like cells). Notably for the application of this method to scATAC-seq, the TF motifs must be specified prior to training in order to define the graph and therefore will impact the learned embedding.…”
Section: Main (mentioning)
confidence: 99%
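
The graph structure described in the statement above, with cells and features as vertices and edges linking related entities, might look like the sketch below. It is a toy construction under stated assumptions, not SIMBA's actual implementation or API: the function `build_cell_feature_edges`, the relation label `"expresses"`, the rule "edge if count > 0", and the example data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str, str]  # (source vertex, relation label, target vertex)


def build_cell_feature_edges(
    counts: Sequence[Sequence[int]],
    cell_ids: Sequence[str],
    feature_ids: Sequence[str],
    relation: str = "expresses",
) -> List[Edge]:
    """Connect each cell to every feature with a nonzero count.

    Assumption for illustration only: an edge exists whenever the
    count is above zero. Rows of counts are cells, columns are features.
    """
    edges: List[Edge] = []
    for i, cell in enumerate(cell_ids):
        for j, feature in enumerate(feature_ids):
            if counts[i][j] > 0:
                edges.append((cell, relation, feature))
    return edges


# Toy cell-by-gene matrix (made up): two cells, three genes.
counts = [
    [3, 0, 1],
    [0, 2, 0],
]
cells = ["cell_0", "cell_1"]
genes = ["GATA1", "CD19", "SPI1"]

edges = build_cell_feature_edges(counts, cells, genes)
vertices = sorted({v for src, _, dst in edges for v in (src, dst)})
```

In a representation like this, other entity types such as peaks or TF motifs would enter as additional edge lists with their own relation labels. That also illustrates the point made in the statement: the set of motifs has to be fixed before the graph can be built, so it affects what is embedded.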