2018
DOI: 10.1101/461954
Preprint

Fast, sensitive, and accurate integration of single cell data with Harmony

Abstract: The rapidly emerging diversity of single cell RNAseq datasets allows us to characterize the transcriptional behavior of cell types across a wide variety of biological and clinical conditions. With this comprehensive breadth comes a major analytical challenge. The same cell type across tissues, from different donors, or in different disease states, may appear to express different genes. A joint analysis of multiple datasets requires the integration of cells across diverse conditions. This is particular…

Cited by 630 publications (888 citation statements)
References 36 publications
“…Specifically, we firstly identified 50 principal components as a result and then selected the significant components according to the p-values produced by "ScoreJackStraw" function for further analysis. The batch effects were removed by harmony package [20].…”
Section: Cell Clustering Dimensional Reduction and Visualization
Type: mentioning
confidence: 99%
“…This approach will confound the batch effect with biological differences between cell types or states that are not shared among datasets. Data integration methods such as Canonical Correlation Analysis (CCA; Butler et al , ), Mutual Nearest Neighbours (MNN; Haghverdi et al , ), Scanorama (preprint: Hie et al , ), RISC (preprint: Liu et al , ), scGen (preprint: Lotfollahi et al , ), LIGER (preprint: Welch et al , ), BBKNN (preprint: Park et al , ), and Harmony (preprint: Korsunsky et al , ) have been developed to overcome this issue. While data integration methods can also be applied to simple batch correction problems, we recommend to be wary of over‐correction given the increased degrees of freedom of non‐linear data integration approaches.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Among ZJ01 unique mutations, 10 (NO. 22-31) were located on the S protein, including 3 samesense mutation, 2 deletion mutation and 5 missense mutation, which led to amino acid changes of Ser596, Gln613, Glu702, Ala771, Ala1015, Pro1053 and Thr1066. Further similarity analysis indicated that the main difference among various coronaviruses located in the Receptor Binding Domain (RBD) region of S1.…”
Section: Sars-cov-2
Type: mentioning
confidence: 99%