2020
DOI: 10.1101/2020.03.17.995498
Preprint
dom2vec: Unsupervised protein domain embeddings capture domains structure and function providing data-driven insights into collocations in domain architectures

Abstract: Motivation: Word embedding approaches have revolutionized Natural Language Processing (NLP) research. These approaches aim to map words to a low-dimensional vector space in which words with similar linguistic features lie close together. These NLP approaches also preserve local linguistic features, such as analogy. Embedding-based approaches have also been developed for proteins. To date, such approaches treat amino acids as words and proteins as sentences of amino acids. These approaches …
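The analogy property mentioned in the abstract means that vector arithmetic on embeddings can recover semantic relations (the classic king − man + woman ≈ queen example). A minimal sketch of this idea, using hypothetical hand-picked 2-D vectors rather than learned embeddings (real embeddings are trained, e.g. with word2vec, and typically have 100+ dimensions):

```python
import math

# Toy 2-D embeddings (hypothetical values chosen so the analogy holds;
# real embeddings are learned from corpora, not assigned by hand).
emb = {
    "man":   (1.0, 0.0),
    "woman": (1.0, 1.0),
    "king":  (2.0, 0.0),
    "queen": (2.0, 1.0),
}

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def analogy(a, b, c):
    """Return the word w maximizing cosine(emb[b] - emb[a] + emb[c], emb[w])."""
    target = tuple(x - y + z for x, y, z in zip(emb[b], emb[a], emb[c]))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("man", "king", "woman"))  # -> queen
```

The same arithmetic is what dom2vec exploits, with protein domains in place of words and domain architectures in place of sentences.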

Cited by 1 publication (1 citation statement)
References 40 publications
“…Word embeddings can also be used to solve supervised and unsupervised machine learning problems, for instance, as inputs to a model that predicts the class of a given word, or to estimate a continuous value associated with the word. Recently, neural embeddings have also been applied to other types of objects such as graphs [15] and biological sequences [8,26].…”
Section: Introduction
Confidence: 99%
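The citing statement's first use case, embeddings as inputs to a model that predicts a word's class, can be sketched with a toy nearest-centroid classifier. All vectors and labels below are hypothetical illustrative values, not real learned embeddings:

```python
# Hypothetical 2-D word embeddings with class labels.
train = {
    "cat": ((1.0, 0.1), "animal"),
    "dog": ((0.9, 0.2), "animal"),
    "car": ((0.1, 1.0), "vehicle"),
    "bus": ((0.2, 0.9), "vehicle"),
}

def centroids(data):
    # Average the embedding vectors of each class.
    sums, counts = {}, {}
    for vec, label in data.values():
        s = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            s[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: tuple(x / counts[lab] for x in s) for lab, s in sums.items()}

def predict(vec, cents):
    # Assign the class whose centroid is nearest in squared Euclidean distance.
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(vec, cents[lab])))

cents = centroids(train)
print(predict((0.95, 0.15), cents))  # embedding of an unseen word -> animal
```

Any downstream model (logistic regression, a neural network, a regressor for the continuous-value case) can consume embedding vectors the same way.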