Motivation
Genomic biotechnology has rapidly advanced, allowing for the inference and modification of genetic and epigenetic information at the single-cell level. While these tools hold enormous potential for basic and clinical research, they also raise difficult issues of how to design studies to deploy them most effectively. In designing a genomic study, a modern researcher might combine many sequencing modalities and sampling protocols, each with different utility, costs, and other tradeoffs. This is especially relevant for studies of somatic variation, which may involve highly heterogeneous cell populations whose differences can be probed via an extensive set of biotechnological tools. Efficiently deploying genomic technologies in this space will require principled ways to create study designs that recover desired genomic information while minimizing various measures of cost.

Results
The central problem this paper attempts to address is how one might create an optimal study design for a genomic analysis, with particular focus on studies of somatic variation, most often applied to cancer genomics. We pose the study design problem as a stochastic constrained nonlinear optimization problem. We introduce a Bayesian optimization framework that iteratively optimizes an objective function using surrogate modeling combined with pattern and gradient search. We apply our procedure to several test cases to derive resource and study design allocations optimized for various goals and criteria, demonstrating its ability to optimize study designs efficiently across diverse scenarios.

Availability and implementation
https://github.com/CMUSchwartzLab/StudyDesignOptimization
Genomic biotechnologies have seen rapid development over the past two decades, allowing for both the inference and modification of genetic and epigenetic information at the single-cell level. While these tools present enormous potential for basic research, diagnostics, and treatment, they also raise difficult issues of how to design research studies to deploy these tools most effectively. In designing a study at the population or individual level, a researcher might combine several different sequencing modalities and sampling protocols, each with different utility, costs, and other tradeoffs. The central problem this paper attempts to address is how one might create an optimal study design for a genomic analysis, with particular focus on studies involving somatic variation, typically for applications in cancer genomics. We pose the study design problem as a stochastic constrained nonlinear optimization problem and introduce a simulation-centered optimization procedure that iteratively optimizes the objective function using surrogate modeling combined with pattern and gradient search. Finally, we demonstrate the use of our procedure on diverse test cases to derive resource and study design allocations optimized for various objectives for the study of somatic cell populations.
Background
Cells and tissues have a remarkable ability to adapt to genetic perturbations via a variety of molecular mechanisms. Nonsense-induced transcriptional compensation, a form of transcriptional adaptation, has recently emerged as one such mechanism, in which nonsense mutations in a gene trigger upregulation of related genes, possibly conferring robustness at cellular and organismal levels. However, beyond a handful of developmental contexts and curated sets of genes, no comprehensive genome-wide investigation of this behavior has been undertaken for mammalian cell types and conditions. How the regulatory-level effects of inherently stochastic compensatory gene networks contribute to phenotypic penetrance in single cells remains unclear.

Results
We analyze existing bulk and single-cell transcriptomic datasets to uncover the prevalence of transcriptional adaptation in mammalian systems across diverse contexts and cell types. We perform regulon gene expression analyses of transcription factor target sets in both bulk and pooled single-cell genetic perturbation datasets. Our results reveal greater robustness in expression of regulons of transcription factors exhibiting transcriptional adaptation compared to those of transcription factors that do not. Stochastic mathematical modeling of minimal compensatory gene networks qualitatively recapitulates several aspects of transcriptional adaptation, including paralog upregulation and robustness to mutation. Combined with machine learning analysis of network features of interest, our framework offers potential explanations for which regulatory steps are most important for transcriptional adaptation.

Conclusions
Our integrative approach identifies several putative hits (genes demonstrating possible transcriptional adaptation) to follow up on experimentally and provides a formal quantitative framework to test and refine models of transcriptional adaptation.
Supplementary Information The online version contains supplementary material available at 10.1186/s13059-024-03351-2.