The rapid accumulation of large-scale single-cell RNA-seq datasets from multiple institutions presents remarkable opportunities for automatically cell annotations through integrative analyses. However, the privacy issue has existed but being ignored, since we are limited to access and utilize all the reference datasets distributed in different institutions globally due to the prohibited data transmission across institutions by data regulation laws. To this end, we present
scPrivacy
, which is the first and generalized automatically single-cell type identification prototype to facilitate single cell annotations in a data privacy-preserving collaboration manner. We evaluated
scPrivacy
on a comprehensive set of publicly available benchmark datasets for single-cell type identification to stimulate the scenario that the reference datasets are rapidly generated and distributed in multiple institutions, while they are prohibited to be integrated directly or exposed to each other due to the data privacy regulations, demonstrating its effectiveness, time efficiency and robustness for privacy-preserving integration of multiple institutional datasets in single cell annotations.
Supporting Information
The supporting information is available online at 10.1007/s11427-022-2224-4. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.