The 5′ untranslated region (5′ UTR) of messenger RNA plays a crucial role in the translatability and stability of the molecule. It is therefore an important component in the design of synthetic biological circuits for high and stable expression of intermediate proteins. Several UTR sequences are patented and used frequently in laboratories. We present UTRGAN, a novel Generative Adversarial Network (GAN)-based model that generates 5′ UTR sequences, coupled with an optimization procedure that steers the generated sequences toward a target property such as high expression for a target gene sequence or high ribosome load. We rigorously analyze the model and show that it can generate sequences that mimic various properties of natural UTR sequences. We then show that the optimization procedure produces sequences that are expected to yield 32% higher expression (up to 7-fold) on a set of target genes and 12% higher ribosome load on average on a set of generated 5′ UTRs (up to 90% for the best 5′ UTR), compared to the initially generated UTR sequences. We also demonstrate that when there is a single target gene of interest, the expected expression increases by 55% on average and up to 100% for certain genes (up to 15-fold for the best 5′ UTR).
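The idea of generating a sequence and then optimizing it toward a target property can be illustrated with a small sketch. The abstract does not say which optimizer is used, so this example swaps in plain hill climbing over the generator's latent vector. Everything in it is hypothetical: `toy_generator` stands in for a trained GAN generator, `toy_predictor` stands in for an expression or ribosome-load predictor (it simply returns GC fraction as a placeholder score with no biological claim), and `optimize_latent` is an illustrative name, not the authors' procedure.

```python
import math
import random

ALPHABET = "ACGU"


def toy_generator(z, length=20):
    """Hypothetical stand-in for a trained generator: latent vector -> RNA string."""
    seq = []
    for i in range(length):
        # One score per nucleotide, computed from a fixed function of z.
        scores = [
            sum(z[j] * math.sin((i + 1) * (j + 1) * (k + 1)) for j in range(len(z)))
            for k in range(len(ALPHABET))
        ]
        seq.append(ALPHABET[scores.index(max(scores))])
    return "".join(seq)


def toy_predictor(seq):
    """Hypothetical property predictor; GC fraction is only a placeholder score."""
    return sum(c in "GC" for c in seq) / len(seq)


def optimize_latent(z0, steps=200, sigma=0.3, seed=0):
    """Hill climbing (an assumption): keep a perturbed latent only if the score improves."""
    rng = random.Random(seed)
    best_z = list(z0)
    best_score = toy_predictor(toy_generator(best_z))
    for _ in range(steps):
        candidate = [v + rng.gauss(0, sigma) for v in best_z]
        score = toy_predictor(toy_generator(candidate))
        if score > best_score:
            best_z, best_score = candidate, score
    return best_z, best_score


if __name__ == "__main__":
    rng = random.Random(1)
    z0 = [rng.gauss(0, 1) for _ in range(8)]
    start_score = toy_predictor(toy_generator(z0))
    z_opt, end_score = optimize_latent(z0)
    print("initial:", toy_generator(z0), start_score)
    print("optimized:", toy_generator(z_opt), end_score)
```

Because candidates are accepted only when the predicted score improves, the optimized score is never lower than the initial one. This mirrors the comparison against "initially generated" sequences only in spirit, not in method.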
RNA-protein binding plays an important role in regulating protein activity by affecting localization and stability. While proteins are usually targeted via small molecules or other proteins, small RNAs, which are easy to design and synthesize, are a largely unexplored and promising avenue. The problem is the lack of methods to generate RNA molecules that have the potential to bind to certain proteins. Here, we propose a method based on generative adversarial networks (GANs) that learns to generate short RNA sequences with natural RNA-like properties such as secondary structure and free energy. Using an optimization technique, we fine-tune these sequences so that they bind to a target protein. We use RNA-protein binding prediction models from the literature to guide the model. We show that even if there is no available guide model trained specifically for the target protein, we can use models trained for similar proteins, such as proteins from the same family, to successfully generate an RNA molecule that binds to the target protein. Using this approach, we generated piRNAs tailored to bind to the SOX2 protein using models trained for its relatives (SOX15, SOX14, and SOX7), and we experimentally validated in vitro that the top-2 molecules we generated specifically bind to SOX2.
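The use of guide models trained on related proteins can be sketched as scoring each candidate RNA with several proxy predictors and ranking by a combined score. The abstract does not state how the proxy models are combined, so the plain average here is an assumption. The scorers are toy motif counters with arbitrary motifs chosen only for illustration; they are not real binding preferences of SOX15, SOX14, or SOX7, and `make_motif_scorer`, `proxy_score`, and `rank_candidates` are hypothetical names.

```python
from statistics import mean


def make_motif_scorer(motif):
    """Toy placeholder for a trained binding predictor: fraction of windows equal to `motif`."""
    def score(seq):
        windows = len(seq) - len(motif) + 1
        if windows <= 0:
            return 0.0
        hits = sum(seq[i:i + len(motif)] == motif for i in range(windows))
        return hits / windows
    return score


# Arbitrary illustrative motifs; NOT real binding preferences of these proteins.
PROXY_MODELS = {
    "SOX15": make_motif_scorer("ACA"),
    "SOX14": make_motif_scorer("CAA"),
    "SOX7": make_motif_scorer("ACAA"),
}


def proxy_score(seq, models=PROXY_MODELS):
    """Average the proxy predictions (the averaging rule is an assumption)."""
    return mean(model(seq) for model in models.values())


def rank_candidates(candidates, top_k=2):
    """Return the top_k candidates by combined proxy score."""
    return sorted(candidates, key=proxy_score, reverse=True)[:top_k]


if __name__ == "__main__":
    candidates = ["ACAAACAAACAA", "GGGGGGGGGGGG", "ACAGGGGGGGGG"]
    for seq in rank_candidates(candidates):
        print(seq, round(proxy_score(seq), 3))
```

In a real pipeline the combined score would serve as the objective for the sequence optimization step, and the top-ranked molecules would then go to experimental validation.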