Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While such assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so techniques are needed to impute values for genotypes whose phenotypes have not been directly assayed. Here, we present an imputation method based on inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction in which mutational effects change as little as possible across adjacent genetic backgrounds. The resulting models can capture complex higher-order genetic interactions near the data, but approach additivity where data are sparse or absent. We apply the method to high-throughput transcription factor binding assays and use it to explore a fitness landscape for protein G.
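The idea of choosing the reconstruction whose mutational effects change least across adjacent backgrounds can be illustrated with a small sketch. It is not the authors' implementation: it assumes biallelic sites, treats observed phenotypes as exact constraints, and minimizes the sum of squared pairwise epistatic coefficients by ordinary least squares. The function name `min_epistasis_impute` and its arguments are hypothetical.

```python
import itertools
import numpy as np

def set_sites(genotype, sites):
    """Return a copy of `genotype` with the given sites set to allele 1."""
    h = list(genotype)
    for s in sites:
        h[s] = 1
    return tuple(h)

def min_epistasis_impute(L, observed):
    """Impute all 2**L phenotypes (hypothetical helper, biallelic sites assumed).

    `observed` maps genotype tuples such as (0, 1, 0) to measured phenotypes.
    Observed values are held fixed; missing values are chosen to minimize the
    sum of squared epistatic coefficients f00 - f10 - f01 + f11 taken over
    every pair of sites in every genetic background.
    """
    genotypes = list(itertools.product((0, 1), repeat=L))
    index = {g: k for k, g in enumerate(genotypes)}

    # One row per (site pair, background): the double-mutant cycle contrast.
    rows = []
    for i, j in itertools.combinations(range(L), 2):
        for g in genotypes:
            if g[i] == 0 and g[j] == 0:
                row = np.zeros(len(genotypes))
                row[index[g]] = 1.0
                row[index[set_sites(g, (i,))]] = -1.0
                row[index[set_sites(g, (j,))]] = -1.0
                row[index[set_sites(g, (i, j))]] = 1.0
                rows.append(row)
    A = np.array(rows)

    obs_idx = [index[g] for g in observed]
    obs_set = set(obs_idx)
    mis_idx = [k for k in range(len(genotypes)) if k not in obs_set]
    y = np.array([observed[g] for g in observed], dtype=float)

    f = np.zeros(len(genotypes))
    f[obs_idx] = y
    if mis_idx:
        # Minimize ||A_missing x + A_observed y||^2 over the missing values x.
        x, *_ = np.linalg.lstsq(A[:, mis_idx], -A[:, obs_idx] @ y, rcond=None)
        f[mis_idx] = x
    return {g: float(f[index[g]]) for g in genotypes}
```

In this sketch, if the observed phenotypes are exactly additive and include the wild type plus every single mutant, the imputed values are the additive predictions, which matches the stated behavior of approaching additivity away from the data. When the observations do not pin down the additive part, `np.linalg.lstsq` returns the minimum-norm solution, which is an arbitrary tie-break of this sketch rather than part of the method.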
Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype–phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype–phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
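The first step described above, examining how phenotypic correlations change with mutational distance, can be sketched as a plain empirical estimate. This is an illustration only: the abstract does not specify the authors' estimator, and the conversion to variance components and the Gaussian process regression are not shown. The function name `distance_correlation` is hypothetical, and sequences are assumed to be equal-length strings.

```python
import itertools
import numpy as np

def distance_correlation(seqs, phenotypes):
    """Empirical phenotypic correlation at each Hamming distance 0..L.

    Hypothetical helper: centers the phenotypes, averages the products of
    centered values over all observed pairs at each distance, and divides
    by the variance (the distance-0 entry). Distances with no observed
    pairs are reported as NaN.
    """
    y = np.asarray(phenotypes, dtype=float)
    y = y - y.mean()
    L = len(seqs[0])

    sums = np.zeros(L + 1)
    counts = np.zeros(L + 1, dtype=int)

    # Distance 0: each sequence paired with itself gives the variance.
    sums[0] = np.sum(y ** 2)
    counts[0] = len(y)

    for (a, ya), (b, yb) in itertools.combinations(zip(seqs, y), 2):
        d = sum(ca != cb for ca, cb in zip(a, b))  # Hamming distance
        sums[d] += ya * yb
        counts[d] += 1

    cov = np.full(L + 1, np.nan)
    seen = counts > 0
    cov[seen] = sums[seen] / counts[seen]
    return cov / cov[0]
```

For example, `distance_correlation(["AA", "AC", "CA", "CC"], [0, 1, 1, 2])` describes a purely additive two-site landscape, where the correlation falls linearly with distance. Departures from that linear decay are the signal that higher-order interactions contribute to the variance.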
Conflicting selection is an important evolutionary mechanism because it impedes directional evolution and helps to maintain phenotypic variation. It can arise when mutualistic and antagonistic selective agents exert opposing selection on the same trait and when distinct phenotypic optima are favored by different fitness components. In this study, we test for conflicting selection through different sexual functions of the hermaphroditic plant Silene stellata during its early and late flowering seasons. We find that selection is consistently stronger during the early flowering season, which aligns with the activity peak of the pollinating seed predator Hadena ectypa. Importantly, we observe that sex‐specific selection on petal dimensions has opposite signs. We propose that the observed sexually conflicting selection on petal design results from negative selection through female function, favoring avoidance of oviposition and subsequent fruit predation by H. ectypa larvae, and positive selection through male function, favoring pollen export by H. ectypa adults. The Silene–Hadena interaction has previously been considered to be largely parasitic. Our findings suggest a trade‐off mechanism that could thwart the evolution of an “escape route” from the nocturnal pollination syndrome by Silene spp. and contribute to the long‐term maintenance of the Silene–Hadena system.