Background: Tumors exhibit extensive intra-tumor heterogeneity, the presence of groups of cellular populations with distinct sets of somatic mutations. This heterogeneity is the result of an evolutionary process, described by a phylogenetic tree. In addition to enabling clinicians to devise patient-specific treatment plans, phylogenetic trees of tumors enable researchers to decipher the mechanisms of tumorigenesis and metastasis. However, the problem of reconstructing a phylogenetic tree T given bulk sequencing data from a tumor is more complicated than the classic phylogeny inference problem. Rather than observing the leaves of T directly, we are given mutation frequencies that are the result of mixtures of the leaves of T. The majority of current tumor phylogeny inference methods employ the perfect phylogeny evolutionary model. The underlying Perfect Phylogeny Mixture (PPM) combinatorial problem typically has multiple solutions. Results: We prove that determining the exact number of solutions to the PPM problem is #P-complete and hard to approximate within a constant factor. Moreover, we show that sampling solutions uniformly at random is hard as well. On the positive side, we provide a polynomial-time computable upper bound on the number of solutions and introduce a simple rejection-sampling based scheme that works well for small instances. Using simulated and real data, we identify factors that contribute to and counteract non-uniqueness of solutions. In addition, we study the sampling performance of current methods, identifying significant biases. Conclusions: Awareness of non-uniqueness of solutions to the PPM problem is key to drawing accurate conclusions in downstream analyses based on tumor phylogenies. This work provides the theoretical foundations for non-uniqueness of solutions in tumor phylogeny inference from bulk DNA samples.
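To illustrate the rejection-sampling idea mentioned in the abstract above, here is a minimal Python sketch. It is not necessarily the authors' scheme. It assumes the common formulation in which the input is a frequency matrix (one row per bulk sample, one column per mutation), mutation 0 is the root, and a tree is a solution when, in every sample, each mutation's frequency is at least the sum of its children's frequencies (the "sum condition"). The function names, the proposal distribution, and the toy frequencies are all hypothetical.

```python
import random

def is_valid_tree(parent, freqs, root=0, tol=1e-9):
    """Check that `parent` encodes a tree rooted at `root` that satisfies
    the (assumed) sum condition for every sample row in `freqs`."""
    n = len(parent)
    # Every node must reach the root without revisiting a node (no cycles).
    for start in range(n):
        seen, v = set(), start
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = parent[v]
    # Sum condition: frequency of a node >= total frequency of its children.
    for row in freqs:
        child_sum = [0.0] * n
        for v in range(n):
            if v != root:
                child_sum[parent[v]] += row[v]
        if any(child_sum[v] > row[v] + tol for v in range(n)):
            return False
    return True

def rejection_sample(freqs, rng, root=0, max_tries=10000):
    """Propose uniformly random parent assignments and return the first one
    that is a valid solution, or None if max_tries is exhausted."""
    n = len(freqs[0])
    for _ in range(max_tries):
        parent = [
            None if v == root else rng.choice([u for u in range(n) if u != v])
            for v in range(n)
        ]
        if is_valid_tree(parent, freqs, root):
            return parent
    return None

# Toy instances (made-up numbers): one sample admits two trees,
# while adding a second sample leaves a single solution.
one_sample = [[1.0, 0.4, 0.3]]
two_samples = [[1.0, 0.6, 0.3], [1.0, 0.2, 0.5]]
```

Because every proposal is drawn uniformly from a superset of the solutions, accepted trees are uniform over the solution set. The cost is that the acceptance rate can shrink quickly as the number of mutations grows, which is consistent with the abstract's remark that the scheme works well for small instances.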
Flux coupling identifies sets of reactions whose fluxes are "coupled" or correlated in genome-scale models. By identifying sets of coupled reactions, modelers can (1) reduce the dimensionality of genome-scale models, (2) identify reactions that must be modulated together during metabolic engineering, and (3) identify sets of important enzymes using high-throughput data. We present three computational tools to improve the efficiency, applicability, and biological interpretability of flux coupling analysis. The first algorithm (cachedFCF) uses information from intermediate solutions to decrease the runtime of standard flux coupling methods by 10- to 100-fold. Importantly, cachedFCF makes no assumptions regarding the structure of the underlying model, allowing efficient flux coupling analysis of models with non-convex constraints. We next developed a mathematical framework (FALCON) that incorporates enzyme activity as continuous variables in genome-scale models. Using data from gene expression and fitness assays, we verified that enzyme sets calculated directly from FALCON models are more functionally coherent than sets of enzymes collected from coupled reaction sets. Finally, we present a method (delete-and-couple) for expanding enzyme sets to allow redundancies and branches in the associated metabolic pathways. The expanded enzyme sets align with known biological pathways, retain functional coherence, and allow pathway-level analyses of genome-scale metabolic models. Together, our algorithms extend flux coupling techniques to enzymatic networks and models with transcriptional regulation and other non-convex constraints. By expanding the efficiency and flexibility of flux coupling, we believe this popular technique will find new applications in metabolic engineering, microbial pathogenesis, and other fields that leverage network modeling.
Data management is a critical challenge in improving the rigor and reproducibility of large projects. Adhering to Findable, Accessible, Interoperable, and Reusable (FAIR) standards provides a baseline for meeting these requirements. Although many existing repositories handle data in a FAIR-compliant manner, there are limited tools in the public domain to handle the metadata burden required to connect data from multi-omic projects that span multiple institutions and are deposited in diverse repositories. One promising approach is the SEEK platform, which allows for diverse metadata and provides an established repository. However, SEEK assumes single deposition events in which a sample is immutable once entered in the database. This design suits published data but is a limitation for ongoing studies, where multiple sequential events may occur in a single sample at different sites. To address this issue, we have created a modified wrapper around the SEEK platform that allows for active data management. It establishes more discrete sample types that are mutable, which expands the types of metadata permitted and allows researchers to track additional information. The use of discrete nodes also converts assays from nodes to edges, creating a network model of the study and more accurately representing the experimental process. With these changes to SEEK, users are able to collect and organize the information that researchers need to improve reusability and reproducibility as well as make data and metadata available to the scientific community through public repositories.
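The network model described in the abstract above, with mutable samples as nodes and assays as edges, can be pictured with a small Python sketch. This is an illustrative data structure only, not the SEEK API or the authors' wrapper. Every class, field, and example value here is hypothetical, and the sketch assumes each sample is produced by at most one assay and that the graph has no cycles.

```python
from dataclasses import dataclass, field

@dataclass
class Sample:
    """A node: a discrete, mutable sample whose metadata can grow over time."""
    sample_id: str
    sample_type: str
    metadata: dict = field(default_factory=dict)

@dataclass
class Assay:
    """An edge: an experimental step turning one sample into another."""
    name: str
    source: str   # sample_id of the input sample
    target: str   # sample_id of the output sample
    site: str     # where the step was performed

class StudyGraph:
    def __init__(self):
        self.samples = {}
        self.assays = []

    def add_sample(self, sample):
        self.samples[sample.sample_id] = sample

    def add_assay(self, assay):
        # Both endpoints must already exist as sample nodes.
        for sid in (assay.source, assay.target):
            if sid not in self.samples:
                raise KeyError(f"unknown sample: {sid}")
        self.assays.append(assay)

    def lineage(self, sample_id):
        """Names of the assays that led to `sample_id`, oldest first.
        Assumes one incoming assay per sample and an acyclic graph."""
        incoming = {a.target: a for a in self.assays}
        path = []
        while sample_id in incoming:
            assay = incoming[sample_id]
            path.append(assay.name)
            sample_id = assay.source
        return path[::-1]

# Hypothetical usage: a tissue sample processed at two sites.
study = StudyGraph()
study.add_sample(Sample("S1", "tissue"))
study.add_sample(Sample("S2", "RNA"))
study.add_sample(Sample("S3", "library"))
study.add_assay(Assay("RNA extraction", "S1", "S2", site="Site A"))
study.add_assay(Assay("library prep", "S2", "S3", site="Site B"))
study.samples["S2"].metadata["concentration_ng_per_ul"] = 50  # added later
```

Treating assays as edges means the history of any sample can be read off by walking the graph backwards, and metadata can be attached to a sample after it is first registered.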