2020
DOI: 10.1016/j.elerap.2020.101003
seq2vec: Analyzing sequential data using multi-rank embedding vectors

Cited by 16 publications (12 citation statements)
References 39 publications
“…While previous methods have focused on the longer proteins that represent a larger fraction of the known proteome, this research focuses on small antimicrobial peptides (AMPs). Much of the previous work discovering protein embeddings with deep neural networks has used large latent-space representations [34][35][36] to maximize data throughput, or graph-based representations, which require the use of graph neural networks to process the protein graphs. In this work we emphasize small latent representations and model interpretability in order to construct interpretable search spaces for AMP design.…”
Section: Introduction
confidence: 99%
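The "small latent representation" this excerpt contrasts with large latent spaces can be illustrated with a compact autoencoder. The following is a minimal sketch, not the cited method; `PeptideAutoencoder`, `LATENT_DIM`, and all hyperparameters are hypothetical.

```python
# Minimal sketch (hypothetical) of a small, interpretable latent space for
# one-hot-encoded peptides, in the spirit of the excerpt above.
import torch
import torch.nn as nn

VOCAB = 20        # standard amino acids
MAX_LEN = 40      # short, AMP-scale sequences
LATENT_DIM = 4    # deliberately small latent search space

class PeptideAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),                      # (B, MAX_LEN, VOCAB) -> (B, MAX_LEN*VOCAB)
            nn.Linear(MAX_LEN * VOCAB, 64),
            nn.ReLU(),
            nn.Linear(64, LATENT_DIM),         # few dimensions => inspectable axes
        )
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 64),
            nn.ReLU(),
            nn.Linear(64, MAX_LEN * VOCAB),
        )

    def forward(self, x):
        z = self.encoder(x)
        logits = self.decoder(z).view(-1, MAX_LEN, VOCAB)
        return logits, z

model = PeptideAutoencoder()
x = torch.zeros(2, MAX_LEN, VOCAB)   # dummy batch of one-hot peptides
logits, z = model(x)
print(z.shape)                       # torch.Size([2, 4])
```

With only a handful of latent dimensions, candidate sequences can be generated by sweeping each axis of `z`, which is what makes such a space searchable for design.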
“…Time series often exhibit multiple properties which are hard to handle for humans (Rojat et al 2021) and DL models (Shen, Wei, and Wang 2022), such as complex time relations, non-normal distributions, non-stationarity, and noise/anomalies, as well as carrying a lot of redundant but highly interrelated information (Shen, Wei, and Wang 2022; Kim, Hong, and Cha 2020). Hence, visual clues can be misleading because of encoded information, e.g., frequencies.…”
Section: Problems of Embedding a Time Series
confidence: 99%
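The point about frequencies as hidden, visually misleading information can be made concrete: two noisy series with similar amplitude statistics can differ only in their spectra. A minimal NumPy sketch (all names illustrative):

```python
# Two noisy sinusoids look statistically alike at a glance, but the FFT
# reveals different dominant frequencies -- the "encoded information"
# the excerpt refers to.
import numpy as np

t = np.linspace(0, 1, 512, endpoint=False)               # 1 s at 512 Hz
a = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(512)    # 5 Hz + noise
b = np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.randn(512)   # 50 Hz + noise

for name, series in [("a", a), ("b", b)]:
    spectrum = np.abs(np.fft.rfft(series))
    peak = np.fft.rfftfreq(512, d=1 / 512)[spectrum.argmax()]
    print(f"series {name}: dominant frequency ~ {peak:.0f} Hz")
```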
“…One major difference in DL modeling concerns the representation of the data, where an embedding is often introduced to better cope with internal information, e.g., word2vec (Mikolov et al 2013). Recently, many new time series embedding methods have emerged, e.g., Kim, Hong, and Cha (2020); Chengyang and Qiang (2022); Cheng et al (2020); Yue et al (2022); Ye and Ma (2022); Tabassum, Menon, and Jastrzebska (2022); Boniol and Palpanas (2022), demonstrating that they can decrease model run time, structure information more informatively, and improve model performance. However, compared to words, for example, time series data is often regarded as rather complex for human interpretation.…”
Section: Introduction
confidence: 99%
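The common thread of the embedding methods listed above is mapping a series to one fixed-size vector, analogous to word2vec for words. Below is a minimal sketch of that idea, assuming a simple 1D-CNN encoder; `SeriesEncoder` is an illustrative stand-in, not the architecture of any cited method.

```python
# Minimal sketch: encode a univariate series (B, 1, T) into a fixed
# embedding (B, D), independent of the series length T.
import torch
import torch.nn as nn

class SeriesEncoder(nn.Module):
    def __init__(self, emb_dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, emb_dim, kernel_size=7, padding=3),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)   # collapses any length T to 1

    def forward(self, x):
        return self.pool(self.conv(x)).squeeze(-1)

enc = SeriesEncoder()
series = torch.randn(8, 1, 300)   # batch of 8 series, 300 time steps each
print(enc(series).shape)          # torch.Size([8, 64])
```

Once series live in such a vector space, downstream models work on short fixed-size inputs, which is where the cited run-time and performance gains come from.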
“…The Hugging Face library has become the standard library for NLP researchers, and several state-of-the-art model implementations exist, such as GPT-2, BERT, RoBERTa, etc., with or without pre-trained model parameters. Using AllenNLP and BERT-as-service [40], we implemented a text classification model that embeds the input sequence, encodes it with a seq2vec ([41], p. 2) encoder, and finally classifies it with a softmax layer coupled with a classification head. The embedding is done by BERT, BertPooler [38] performs the encoding, and Adam [38] is used as the optimizer.…”
Section: CSE-PersistenceBERT
confidence: 99%
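The pipeline this excerpt describes (BERT token embeddings, a pooled seq2vec encoding, and a softmax classification head trained with Adam) can be sketched as follows. This is a minimal illustration using Hugging Face transformers rather than the AllenNLP/BERT-as-service setup of the cited work; `BertClassifier` and all hyperparameters are assumptions.

```python
# Minimal sketch: BERT embeds the sequence, the BertPooler output serves as
# the fixed seq2vec encoding, and a linear head + cross-entropy (which
# couples in the softmax) performs classification.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertClassifier(nn.Module):
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.head = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, **inputs):
        out = self.bert(**inputs)
        pooled = out.pooler_output    # BertPooler: [CLS] state -> fixed vector
        return self.head(pooled)      # logits; softmax lives in the loss

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertClassifier()
batch = tok(["an example input sequence"], return_tensors="pt")

logits = model(**batch)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
loss.backward()
optimizer.step()
```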