Spatial transcriptomics technologies profile transcriptomes while preserving spatial information, which enables high-resolution characterization of transcriptional patterns and reconstruction of tissue architecture. Because many recent spatial transcriptomics technologies produce low-resolution spots, uncovering cellular heterogeneity within those spots is crucial for disentangling the spatial patterns of cell types, and many methods have been proposed for this purpose. Here, we benchmark 18 existing methods on the cellular deconvolution task using 50 real-world and simulated datasets, evaluating the accuracy, robustness, and usability of each method. We compare the methods comprehensively across different metrics, resolutions, spatial transcriptomics technologies, spot numbers, and gene numbers. In terms of performance, CARD, Cell2location, and Tangram are the best methods for the cellular deconvolution task. To refine our comparative results, we provide decision-tree-style guidelines and recommendations for method selection, including the methods' additional features, to help users choose the method best suited to their needs.
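As a rough illustration of how deconvolution accuracy can be scored, the sketch below compares predicted cell-type proportions per spot against a ground truth. The abstract does not name the metrics used in the benchmark; root-mean-square error and Pearson correlation are assumed here only as common choices, and the function names (`rmse`, `pearson`, `score_spots`) are hypothetical.

```python
import math


def rmse(pred, truth):
    """Root-mean-square error between two equal-length proportion vectors."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def pearson(pred, truth):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(truth)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in truth))
    if sd_p == 0 or sd_t == 0:
        return float("nan")
    return cov / (sd_p * sd_t)


def score_spots(pred_spots, true_spots):
    """Average per-spot RMSE and Pearson correlation over all spots.

    Each spot is a list of cell-type proportions (assumed to sum to 1).
    Spots whose correlation is undefined (NaN) are skipped in the average.
    """
    errors = [rmse(p, t) for p, t in zip(pred_spots, true_spots)]
    corrs = [pearson(p, t) for p, t in zip(pred_spots, true_spots)]
    corrs = [c for c in corrs if not math.isnan(c)]
    mean_rmse = sum(errors) / len(errors)
    mean_corr = sum(corrs) / len(corrs) if corrs else float("nan")
    return mean_rmse, mean_corr


# Toy example: two spots, three cell types (made-up numbers for illustration).
predicted = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
truth = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
mean_rmse, mean_corr = score_spots(predicted, truth)
```

Lower RMSE and higher correlation indicate predictions closer to the ground truth; a full benchmark would repeat this across datasets, resolutions, and technologies.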
We present a novel self-supervised Contrastive LEArning framework for single-cell ribonucleic acid (RNA)-sequencing (CLEAR) data representation and downstream analysis. Compared with current methods, CLEAR overcomes the heterogeneity of the experimental data with a specifically designed representation learning task and can therefore handle batch effects and dropout events simultaneously. It achieves superior performance on a broad range of fundamental tasks, including clustering, visualization, dropout correction, batch effect removal, and pseudo-time inference. The proposed method successfully identifies and illustrates inflammation-related mechanisms in a COVID-19 disease study with 43 695 single cells from peripheral blood mononuclear cells.
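To make the idea of contrastive representation learning concrete, here is a minimal sketch of an InfoNCE-style loss for a single anchor cell. This is not taken from the CLEAR paper: the loss form, the temperature value, and the random gene-masking augmentation used to create a second "view" of a cell are all assumptions chosen for illustration, and `mask_genes`, `cosine`, and `info_nce` are hypothetical names.

```python
import math
import random


def mask_genes(expression, drop_prob, rng):
    """Assumed augmentation: zero each gene with probability drop_prob,
    loosely mimicking dropout events, to create a second view of a cell."""
    return [0.0 if rng.random() < drop_prob else x for x in expression]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss for one anchor: -log softmax of the positive pair's
    similarity among the positive and all negative pairs."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]


# Toy example: the positive is a masked copy of the anchor cell,
# the negatives are other cells (made-up expression values).
rng = random.Random(0)
cell = [5.0, 0.0, 2.0, 1.0]
view = mask_genes(cell, drop_prob=0.2, rng=rng)
others = [[0.0, 4.0, 0.0, 3.0], [1.0, 1.0, 6.0, 0.0]]
loss = info_nce(cell, view, others)
```

In an actual model the vectors would be learned embeddings rather than raw expression values, and the loss would be minimized so that two views of the same cell end up closer together than views of different cells.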
Modern machine learning models applied to omic data analysis tasks raise the threat of privacy leakage for the patients included in those datasets. Despite advances in privacy technologies, existing methods tend to introduce too much noise, which hampers model accuracy and usefulness. Here, we built a secure and privacy-preserving machine learning (PPML) system, PPML-Omics, by combining federated learning (FL), differential privacy (DP), and a shuffling mechanism. We applied this system to analyze data from three sequencing technologies and addressed the privacy concern in three major omic data tasks under three representative deep learning models: cancer classification with bulk RNA-seq, clustering with single-cell RNA-seq, and the integration of spatial gene expression and tumour morphology with spatial transcriptomics. We also examined privacy breaches in depth through privacy attack experiments and demonstrated that PPML-Omics could protect patients' privacy. In each of these applications, PPML-Omics outperformed state-of-the-art systems under the same level of privacy guarantee, demonstrating the versatility of the system in balancing privacy-preserving capability and utility in omic data analysis. Furthermore, we gave a theoretical proof of the privacy-preserving capability of PPML-Omics, suggesting that it is the first mathematically guaranteed model with robust and generalizable empirical performance.
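The combination of federated learning, differential privacy, and shuffling can be sketched in a few lines. The code below is an illustrative assumption, not the PPML-Omics implementation: it assumes each client clips its model update to a maximum L2 norm and adds Gaussian noise locally, after which a shuffler permutes the updates so the server cannot link an update to a client before averaging. All function names and parameter values are hypothetical, and no formal privacy budget is computed.

```python
import math
import random


def clip(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize(update, max_norm, noise_std, rng):
    """Client side: clip the update, then add Gaussian noise per coordinate."""
    return [x + rng.gauss(0.0, noise_std) for x in clip(update, max_norm)]


def shuffle_and_average(updates, rng):
    """Shuffler plus server: permute the updates to break the link between
    an update and its sender, then average coordinate-wise."""
    shuffled = list(updates)
    rng.shuffle(shuffled)
    dim = len(shuffled[0])
    return [sum(u[i] for u in shuffled) / len(shuffled) for i in range(dim)]


# Toy round with three clients (made-up update vectors).
rng = random.Random(42)
client_updates = [[0.9, -0.2], [3.0, 4.0], [-0.5, 0.1]]
noisy = [privatize(u, max_norm=1.0, noise_std=0.1, rng=rng) for u in client_updates]
global_update = shuffle_and_average(noisy, rng)
```

Clipping bounds how much any single client can influence the average, which is what lets the added noise translate into a privacy guarantee; the noise scale trades accuracy against privacy, which is the balance the abstract refers to.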
Recombination is one of the essential genetic processes in sexually reproducing organisms and occurs more frequently in some regions, called recombination hotspots. Although several factors, such as PRDM9 binding motifs, are known to be related to the hotspots, their contributions to recombination hotspots have not been quantified, and other determinants are yet to be elucidated. Here, we develop a computational method, RHSNet, based on deep learning and signal processing, to identify and quantify the hotspot determinants in a purely data-driven manner, utilizing datasets from various studies, populations, sexes, and species. In addition to accurately identifying hotspot regions and the well-known determinants, RHSNet is sensitive to differences between PRDM9 alleles and between sexes, and can generalize to PRDM9-lacking species. The cross-sex, cross-population, and cross-species studies suggest that the proposed method has the potential to identify and quantify the evolutionary determinant motifs.