2019
DOI: 10.1101/2019.12.31.890699
Preprint
Sequence representations and their utility for predicting protein-protein interactions

Abstract: Protein-protein interactions (PPIs) are a crucial mechanism underpinning the function of the cell. Predicting the likely relationship between a pair of proteins is thus an important problem in bioinformatics, and a wide range of machine-learning-based methods have been proposed for this task. Their success depends heavily on the construction of the feature vectors, with most using a set of physico-chemical properties derived from the sequence; few work directly with the sequence itself. Recent works on emb…

Cited by 2 publications (2 citation statements) | References 46 publications (60 reference statements)
“…Another embedding method, doc2vec [42], includes the whole context to some extent and performs better than word2vec on selected tasks. Several methods use doc2vec to represent proteins [5,27,97–101]. Also, deep language models such as BERT [91] and ELMo [46] were originally developed for NLP and later employed for protein representations [23,28].…”
Section: Different Approaches for Representing Proteins
Confidence: 99%
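The doc2vec-style methods cited above treat each protein as a "document" whose "words" are overlapping k-mers of the amino-acid sequence. A minimal sketch of that preprocessing step is below; the function names and the choice of k = 3 are illustrative, not taken from any of the cited papers:

```python
def kmer_tokens(seq, k=3):
    """Split a protein sequence into overlapping k-mers, the 'words'
    that word2vec/doc2vec-style trainers consume."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def protein_documents(proteins, k=3):
    """Turn {protein_id: sequence} into (words, tags) pairs, matching
    the shape of e.g. gensim's TaggedDocument for doc2vec training."""
    return [(kmer_tokens(seq, k), [pid]) for pid, seq in proteins.items()]


# Example: a short (artificial) sequence becomes a list of 3-mer words.
docs = protein_documents({"P1": "MKVLAA"})
# docs[0] -> (['MKV', 'KVL', 'VLA', 'LAA'], ['P1'])
```

After training a doc2vec model on such documents, each protein tag maps to a fixed-length vector that can serve as the PPI feature representation.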
“…Asgari et al. proposed BioVec, based on the skip-gram model, for biological sequence representation (Asgari and Mofrad, 2015). Kimothi et al. developed a model named seq2vec based on doc2vec, which is an extension of the original word2vec (Kimothi et al., 2016). The dna2vec model is dedicated to representing variable-length words (Ng, 2017a).…”
Section: Introduction
Confidence: 99%
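The skip-gram model underlying BioVec predicts, for each token, the tokens within a small context window. A minimal sketch of the (center, context) pair generation over k-mer tokens is shown here; it illustrates the training-pair construction only, not the neural training loop, and the window size is an assumed example value:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs for a skip-gram model:
    every token is paired with each neighbour within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Example with 3-mer tokens and window=1:
pairs = skipgram_pairs(["MKV", "KVL", "VLA"], window=1)
# -> [('MKV', 'KVL'), ('KVL', 'MKV'), ('KVL', 'VLA'), ('VLA', 'KVL')]
```

The model then learns embeddings such that a center k-mer is predictive of its context k-mers; summing or averaging the learned k-mer vectors gives a fixed-length sequence representation, as in BioVec.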