2020
DOI: 10.1101/2020.01.07.897892
Preprint

TLmutation: predicting the effects of mutations using transfer learning

Abstract: A reoccurring challenge in bioinformatics is predicting the phenotypic consequence of amino acid variation in proteins. Due to recent advances in sequencing techniques, sufficient genomic data is becoming available to train models that predict the evolutionary statistical energies for each sequence, but there is still inadequate experimental data to directly predict functional effects. One approach to overcome this data scarcity is to apply transfer learning and train more models with available datasets. In th…

Cited by 8 publications (23 citation statements)
References 55 publications
“…We assess all methods on 19 labelled mutagenesis datasets, each comprising hundreds to tens of thousands of mutant sequences. Most related works [21,23,24,27] evaluate on a subset of or all of the mutation effect datasets introduced by EVmutation [20]. We included all EVmutation [20] protein data sets with at least 100 entries in order to have sufficient data to glean insights from.…”
Section: Results (mentioning)
confidence: 99%
“…Each of these labelled data sets is paired with an evolutionary data set found by searching through the UniRef100 database [39] with the wild-type protein sequence as query. A jackhmmer [43] search is commonly used for retrieving evolutionary data [20,21,24,26]; however, these search results can differ depending on the jackhmmer parameters used. Thus, for simplicity and ease of comparison with other papers [20,21,24], we use the MSAs provided by EVmutation [20] when available, and in other cases conduct our own jackhmmer search with parameters as used in the EVmutation paper (see Methods).…”
Section: Results (mentioning)
confidence: 99%