2022
DOI: 10.1101/2022.09.27.509674
Preprint

Benchmarking strategies for cross-species integration of single-cell RNA sequencing data

Abstract: The emergence of multiple species cell atlases has opened the opportunity to investigate evolutionary relationships at the cell type level and understand similarities and differences between cell types across species. Cross-species integration of single-cell RNA-sequencing data has been a key approach to comparing cell types between species. To robustly identify homologous cell types and species-specific cell populations, it is crucial to obtain reliable and informative integration. However, currently there ar…

Cited by 19 publications (33 citation statements)
References 58 publications
“…We evaluate performance by measuring how well labels can be transferred from zebrafish to frog. In particular, we first integrate the datasets using SATURN and then use the cell-type annotations of cells from a reference species, zebrafish, to train a logistic classifier to predict cell types [29] (Supplementary Note 3). The classifier's performance is then tested on the embeddings of the query species, frog (Fig.…”
Section: Results (mentioning)
confidence: 99%
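
The label-transfer evaluation described in the statement above can be sketched roughly as follows. This is a minimal illustration and not the citing authors' code: the embeddings are synthetic stand-ins for an integrated latent space, and the names `transfer_labels`, `simulate`, `centers`, and `cell_types` are hypothetical. It assumes scikit-learn's `LogisticRegression` as the logistic classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score


def transfer_labels(ref_emb, ref_labels, query_emb):
    """Fit a logistic classifier on reference-species embeddings and
    predict cell types for query-species embeddings (hypothetical helper)."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(ref_emb, ref_labels)
    return clf.predict(query_emb)


# Synthetic 2-D "integrated" embeddings: three well-separated cell types.
# Real embeddings would come from an integration method; these are assumptions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
cell_types = np.array(["type_a", "type_b", "type_c"])


def simulate(n_per_type):
    """Draw n_per_type cells around each cell-type center."""
    emb = np.vstack(
        [c + rng.normal(scale=0.5, size=(n_per_type, 2)) for c in centers]
    )
    labels = np.repeat(cell_types, n_per_type)
    return emb, labels


ref_emb, ref_labels = simulate(100)     # reference species (e.g. zebrafish)
query_emb, query_labels = simulate(50)  # query species (e.g. frog)

predicted = transfer_labels(ref_emb, ref_labels, query_emb)
accuracy = accuracy_score(query_labels, predicted)
print(f"label-transfer accuracy on query species: {accuracy:.3f}")
```

With real data, the accuracy depends on how well the integration aligns homologous cell types across species, which is the quantity this kind of evaluation is meant to probe.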
“…Additionally, while we here did not observe a benefit in more complex orthologue mapping for cross-species integration, this may be of greater importance when integrating more evolutionarily divergent species. Thus, future work could explore the effect of different gene mapping strategies, both as part of data preprocessing [19] as well as in the model internally, such as by enabling flexible gene relationships [5,30] or using gene embeddings [49,52].…”
Section: Discussion and Outlook (mentioning)
confidence: 99%
“…These methods enable good integration of batch effects arising from processing similar samples in different laboratories; however, they do not enable sufficient integration when differences between datasets are more substantial due to datasets originating from distinct biological or technical "systems", such as multiple species or sequencing technologies (e.g. cell-nuclei) [16][17][18] (Figure 1a,b). To improve performance, different approaches for tuning and increasing batch correction have been proposed, such as Kullback-Leibler divergence (KL) regularization strength tuning [22], latent space adversarial learning [23][24][25], and latent space cycle-consistency that was previously used specifically for multi-omic integration [26][27][28], and for increasing biological preservation the use of the multimodal variational mixture of posteriors (VampPrior) [29] as the prior for the latent space was proposed [30].…”
Section: Introduction (mentioning)
confidence: 99%
“…To test this, we first collected publicly available scRNA-Seq datasets of developing limbs from axolotls (Ambystoma mexicanum) [23] and species with developmental stages that are documented to have an AER: humans (Homo sapiens) [24, 25], mice (Mus musculus) [26, 27], chickens (Gallus gallus) [28], and frogs (Xenopus laevis) [23, 29] (Figures S1 and S2 and Supplementary Table 1). Then, using Seurat integration, which has been shown to integrate cross-species datasets with high accuracy [30], we established a multi-species limb atlas that in total contains 50,248 representative cells from all samples (Figures 1B and S3). Coarse lineage annotation detected various mesodermal and ectodermal populations in the multi-species limb atlas, and finer annotation captured the AER cluster (Figures 1C and S3).…”
Section: Introduction (mentioning)
confidence: 99%