2020
DOI: 10.1016/j.elerap.2020.101003
seq2vec: Analyzing sequential data using multi-rank embedding vectors

Cited by 16 publications (12 citation statements)
References 39 publications
“…While previous methods have focused on the longer proteins that represent a larger fraction of the known proteome, this research focuses on small antimicrobial peptides (AMPs). Much of the previous work discovering protein embeddings with deep neural networks has used large latent-space representations [34][35][36] to maximize data throughput, or graph-based representations, which require the use of graph neural networks to process the protein graphs. In this work we emphasize small latent representations and model interpretability in order to construct interpretable search spaces for AMP design.…”
Section: Introduction
confidence: 99%
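The "small latent representation" this excerpt contrasts with large latent spaces can be illustrated with a compact autoencoder. The following is a minimal sketch, not the cited method; `PeptideAutoencoder`, `LATENT_DIM`, and all hyperparameters are hypothetical.

```python
# Minimal sketch (hypothetical) of a small, interpretable latent space for
# one-hot-encoded peptides, in the spirit of the excerpt above.
import torch
import torch.nn as nn

VOCAB = 20        # standard amino acids
MAX_LEN = 40      # short, AMP-scale sequences
LATENT_DIM = 4    # deliberately small latent search space

class PeptideAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),                      # (B, MAX_LEN, VOCAB) -> (B, MAX_LEN*VOCAB)
            nn.Linear(MAX_LEN * VOCAB, 64),
            nn.ReLU(),
            nn.Linear(64, LATENT_DIM),         # few dimensions => inspectable axes
        )
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 64),
            nn.ReLU(),
            nn.Linear(64, MAX_LEN * VOCAB),
        )

    def forward(self, x):
        z = self.encoder(x)
        logits = self.decoder(z).view(-1, MAX_LEN, VOCAB)
        return logits, z

model = PeptideAutoencoder()
x = torch.zeros(2, MAX_LEN, VOCAB)   # dummy batch of one-hot peptides
logits, z = model(x)
print(z.shape)                       # torch.Size([2, 4])
```

With only a handful of latent dimensions, candidate sequences can be generated by sweeping each axis of `z`, which is what makes such a space searchable for design.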
“…Time series often exhibit multiple properties which are hard to handle for humans (Rojat et al 2021) and DL models (Shen, Wei, and Wang 2022), such as complex time relations, non-normal distributions, non-stationarity, and noise/anomalies, as well as carrying a lot of redundant but highly interrelated information (Shen, Wei, and Wang 2022; Kim, Hong, and Cha 2020). Hence, visual clues can be misleading because of encoded information, e.g., frequencies.…”
Section: Problems of Embedding a Time Series
confidence: 99%
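The point about frequencies as hidden, visually misleading information can be made concrete: two noisy series with similar amplitude statistics can differ only in their spectra. A minimal NumPy sketch (all names illustrative):

```python
# Two noisy sinusoids look statistically alike at a glance, but the FFT
# reveals different dominant frequencies -- the "encoded information"
# the excerpt refers to.
import numpy as np

t = np.linspace(0, 1, 512, endpoint=False)               # 1 s at 512 Hz
a = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(512)    # 5 Hz + noise
b = np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.randn(512)   # 50 Hz + noise

for name, series in [("a", a), ("b", b)]:
    spectrum = np.abs(np.fft.rfft(series))
    peak = np.fft.rfftfreq(512, d=1 / 512)[spectrum.argmax()]
    print(f"series {name}: dominant frequency ~ {peak:.0f} Hz")
```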
“…One major difference in DL modeling concerns the representation of the data, where an embedding is often introduced to better cope with internal information, e.g., word2vec (Mikolov et al 2013). Recently, many new time series embedding methods have emerged, e.g., Kim, Hong, and Cha (2020); Chengyang and Qiang (2022); Cheng et al (2020); Yue et al (2022); Ye and Ma (2022); Tabassum, Menon, and Jastrzebska (2022); Boniol and Palpanas (2022), demonstrating that they can decrease model run time, structure information more informatively, and improve model performance. However, compared to words, for example, time series data is often regarded as rather complex for human interpretation.…”
Section: Introduction
confidence: 99%
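The common thread of the embedding methods listed above is mapping a series to one fixed-size vector, analogous to word2vec for words. Below is a minimal sketch of that idea, assuming a simple 1D-CNN encoder; `SeriesEncoder` is an illustrative stand-in, not the architecture of any cited method.

```python
# Minimal sketch: encode a univariate series (B, 1, T) into a fixed
# embedding (B, D), independent of the series length T.
import torch
import torch.nn as nn

class SeriesEncoder(nn.Module):
    def __init__(self, emb_dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, emb_dim, kernel_size=7, padding=3),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)   # collapses any length T to 1

    def forward(self, x):
        return self.pool(self.conv(x)).squeeze(-1)

enc = SeriesEncoder()
series = torch.randn(8, 1, 300)   # batch of 8 series, 300 time steps each
print(enc(series).shape)          # torch.Size([8, 64])
```

Once series live in such a vector space, downstream models work on short fixed-size inputs, which is where the cited run-time and performance gains come from.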
“…The Hugging Face library has become the standard library for NLP researchers, and several state-of-the-art model implementations exist, such as GPT-2, BERT, RoBERTa, etc., with or without pre-trained model parameters. Using AllenNLP and BERT-as-service [40], we implemented a text classification model that embeds the input sequence, encodes it with a seq2vec ([41], p. 2) encoder, and finally classifies it with a softmax layer coupled with a classification head. The embedding is done by BERT, BertPooler [38] performs the encoding, and Adam [38] is used as the optimizer.…”
Section: CSE-PersistenceBERT
confidence: 99%
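The pipeline this excerpt describes (BERT token embeddings, a pooled seq2vec encoding, and a softmax classification head trained with Adam) can be sketched as follows. This is a minimal illustration using Hugging Face transformers rather than the AllenNLP/BERT-as-service setup of the cited work; `BertClassifier` and all hyperparameters are assumptions.

```python
# Minimal sketch: BERT embeds the sequence, the BertPooler output serves as
# the fixed seq2vec encoding, and a linear head + cross-entropy (which
# couples in the softmax) performs classification.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertClassifier(nn.Module):
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.head = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, **inputs):
        out = self.bert(**inputs)
        pooled = out.pooler_output    # BertPooler: [CLS] state -> fixed vector
        return self.head(pooled)      # logits; softmax lives in the loss

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertClassifier()
batch = tok(["an example input sequence"], return_tensors="pt")

logits = model(**batch)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
loss.backward()
optimizer.step()
```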