Although clinical applications represent the next challenge in single-cell genomics and digital pathology, we are still lacking computational methods to analyse single-cell and pathomics data at a patient level for finding patient trajectories associated with diseases. This is challenging as a single-cell/pathomics data is represented by clusters of cells/structures, which cannot be easily compared with other samples. We here propose PatIent Level analysis with Optimal Transport (PILOT). PILOT uses optimal transport to compute the Wasserstein distance between two single single-cell experiments. This allows us to perform unsupervised analysis at the sample level and to uncover trajectories associated with disease progression. Moreover, PILOT provides a statistical approach to delineate non-linear changes in cell populations, gene expression and tissues structures related to the disease trajectories. We evaluate PILOT and competing approaches in disease single-cell genomics and pathomics studies with up to 1.000 patients/donors and millions of cells or structures. Results demonstrate that PILOT detects disease-associated samples, cells, structures and genes from large and complex single-cell and pathomics data.
Background Massive amounts of data are produced by combining next-generation sequencing with complex biochemistry techniques to characterize regulatory genomics profiles, such as protein–DNA interaction and chromatin accessibility. Interpretation of such high-throughput data typically requires different computation methods. However, existing tools are usually developed for a specific task, which makes it challenging to analyze the data in an integrative manner. Results We here describe the Regulatory Genomics Toolbox (RGT), a computational library for the integrative analysis of regulatory genomics data. RGT provides different functionalities to handle genomic signals and regions. Based on that, we developed several tools to perform distinct downstream analyses, including the prediction of transcription factor binding sites using ATAC-seq data, identification of differential peaks from ChIP-seq data, and detection of triple helix mediated RNA and DNA interactions, visualization, and finding an association between distinct regulatory factors. Conclusion We present here RGT; a framework to facilitate the customization of computational methods to analyze genomic data for specific regulatory genomics problems. RGT is a comprehensive and flexible Python package for analyzing high throughput regulatory genomics data and is available at: https://github.com/CostaLab/reg-gen. The documentation is available at: https://reg-gen.readthedocs.io
Massive amounts of data are produced by combining next-generation sequencing (NGS) with complex biochemistry techniques to characterize regulatory genomics profiles, such as protein-DNA interaction and chromatin accessibility. Interpretation of such high-throughput data typically requires different computation methods. However, existing tools are usually developed for a specific task, which makes it challenging to analyze the data in an integrative manner. We here describe the Regulatory Genomics Toolbox (RGT), a computational library for the integrative analysis of regulatory genomics data. RGT provides different functionalities to handle genomic signals and regions. Based on that, we developed several tools to perform distinct downstream analyses, including the prediction of transcription factor binding sites using ATAC-seq data, identification of differential peaks from ChIP-seq data, and detection of triple helix mediated RNA and DNA interactions, visualization, and finding an association between distinct regulatory factors.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.