2021
DOI: 10.1101/2021.06.30.450414
Preprint

Probing molecular specificity with deep sequencing and biophysically interpretable machine learning

Abstract: Quantifying sequence-specific protein-ligand interactions is critical for understanding and exploiting numerous cellular processes, including gene regulation and signal transduction. Next-generation sequencing (NGS) based assays are increasingly being used to profile these interactions with high-throughput. However, these assays do not provide the biophysical parameters that have long been used to uncover the quantitative rules underlying sequence recognition. We developed a highly flexible machine learning fr…

Citations: cited by 3 publications (4 citation statements)
References: 75 publications
“…Traditionally, extracting affinities from in vivo assays has focused on discovering short motifs enriched within bound loci and then using these motif-based models to predict DNA binding with somewhat limited success (105)(106)(107). By contrast, modern machine learning methods such as deep neural networks can predict binding with high accuracy but are sometimes dismissed as overparameterized black box models with no way to extract biophysical information (108). To remedy this, some studies have suggested building stereotyped networks with fixed architectures, sacrificing flexibility in modeling and training to obtain nodes and weights that have explicit biophysical interpretations (109,110).…”
Section: Discussion (mentioning)
confidence: 99%
“…In practice, one of two methods is used to overcome the difficulties that gauge freedoms present. One method, called "gauge fixing", removes gauge freedoms by introducing additional constraints on model parameters (2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18).…”
Section: Introduction (mentioning)
confidence: 99%
“…In practice, one of two methods is typically used to overcome the difficulties that such gauge freedoms can present. One method-called "gauge fixing"-removes gauge freedoms by introducing additional constraints on model parameters (2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18). Another method limits the mathematical models that one uses to models that do not have any gauge freedoms (19)(20)(21)(22)(23)(24).…”
Section: Introduction (mentioning)
confidence: 99%
“…Additionally, ML techniques have been employed for protein fitness prediction [3]- [5], which enables the design of proteins with specific functions or properties. They are also used for forecasting protein-ligand binding affinity [6], [7], a critical aspect of drug discovery. Moreover, large language models have been pretrained on extensive protein sequence databases [8], [9], enabling them to capture intricate sequence-structure-function relationships.…”
Section: Introduction (mentioning)
confidence: 99%