2024
DOI: 10.1093/bioinformatics/btae340

SuPreMo: a computational tool for streamlining in silico perturbation using sequence-based predictive models

Ketrin Gjoni,
Katherine S Pollard

Abstract: Summary: The increasing development of sequence-based machine learning models has raised the demand for manipulating sequences for this application. However, existing approaches to edit and evaluate genome sequences using models have limitations, such as incompatibility with structural variants, challenges in identifying responsible sequence perturbations, and the need for VCF file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo)…

Cited by 2 publications
(1 citation statement)
References 35 publications
“…To quantify the influence of short DNA sequences on genome folding, we defined a disruption score as the square root of the sum of squared differences between predicted maps before and after local sequence perturbations (Fig 1), as previously used to interpret Akita's predictions [19,27]. This disruption score is sensitive to gain or loss of boundaries, as well as changes in TAD substructures [28,29]. We leveraged the ensemble of models to validate sequence perturbation approaches at CTCF sites by their cross-model stability.…”
Section: Cross Species Model
Type: mentioning
confidence: 99%
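
The disruption score described in the statement above (the square root of the sum of squared differences between the predicted maps before and after a perturbation) can be illustrated with a minimal sketch. The function name `disruption_score` and the use of plain nested lists for contact maps are assumptions made for this illustration; this is not the code used by SuPreMo, Akita, or the citing paper.

```python
import math
from typing import Sequence


def disruption_score(
    ref_map: Sequence[Sequence[float]],
    alt_map: Sequence[Sequence[float]],
) -> float:
    """Return sqrt(sum((alt - ref)^2)) over all map entries.

    ref_map: predicted map before the perturbation (hypothetical input).
    alt_map: predicted map after the perturbation (hypothetical input).
    """
    if len(ref_map) != len(alt_map):
        raise ValueError("maps must have the same number of rows")
    total = 0.0
    for ref_row, alt_row in zip(ref_map, alt_map):
        if len(ref_row) != len(alt_row):
            raise ValueError("maps must have the same number of columns")
        for ref_value, alt_value in zip(ref_row, alt_row):
            # Accumulate the squared element-wise difference.
            total += (alt_value - ref_value) ** 2
    return math.sqrt(total)


# Toy 2x2 maps: the off-diagonal entries differ by 3 and 4.
reference = [[0.0, 1.0], [1.0, 0.0]]
perturbed = [[0.0, 4.0], [5.0, 0.0]]
print(disruption_score(reference, perturbed))  # sqrt(9 + 16) = 5.0
```

Identical maps give a score of 0, and the score grows with the magnitude of any change, which is consistent with the statement that it responds to boundary gain or loss as well as to changes in TAD substructure.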