2021
DOI: 10.48550/arxiv.2109.05522
Preprint

TEASEL: A Transformer-Based Speech-Prefixed Language Model

Abstract: Multimodal language analysis is a burgeoning field of NLP that aims to simultaneously model a speaker's words, acoustical annotations, and facial expressions. In this area, lexicon features usually outperform other modalities because they are pre-trained on large corpora via Transformer-based models. Despite their strong performance, training a new self-supervised learning (SSL) Transformer on any modality is not usually attainable due to insufficient data, which is the case in multimodal language learning. Th…

Cited by 3 publications (5 citation statements)
References 31 publications
“…Subsequently, the Transformer was introduced into various domains, including multimodal sentiment analysis, where it spawned a series of significantly innovative approaches. The Transformer can model temporal information in the data and process unimodal data through its self-attention mechanism, and it can also realize interactions between different modalities [29–32, 48–52]. Furthermore, the Transformer exhibits strong generalization capabilities, making it suitable for different types of multimodal sentiment analysis tasks.…”
Section: Methods (Type / Description / Advantages / Flaws), mentioning
confidence: 99%
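To make the mechanism described in this statement concrete, here is a minimal PyTorch sketch of a cross-modal attention block in which text queries attend over an audio sequence, in the spirit of the cross-modal Transformers cited above. The feature dimensions, the audio projection layer, and the residual layout are illustrative assumptions, not details taken from the cited papers.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Text queries attend over the audio sequence (illustrative block)."""
    def __init__(self, d_model=768, d_audio=74, n_heads=8):
        super().__init__()
        self.audio_proj = nn.Linear(d_audio, d_model)  # align audio dim to text dim
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text, audio):
        # text: (batch, T_text, d_model), audio: (batch, T_audio, d_audio)
        a = self.audio_proj(audio)
        # queries come from the text modality; keys/values from the audio modality
        fused, _ = self.attn(query=text, key=a, value=a)
        return self.norm(text + fused)  # residual connection

# toy usage with random tensors (dimensions are assumptions)
text = torch.randn(2, 50, 768)   # 50 word-level features
audio = torch.randn(2, 200, 74)  # 200 frame-level acoustic features
out = CrossModalAttention()(text, audio)  # (2, 50, 768)
```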
“…With the invention of the Transformer [28] and its outstanding performance in natural language processing, the Transformer has been widely used in other research areas such as multimodal sentiment analysis. For example, [29–32] leverage the Transformer encoder to model correlation information between different modalities and have achieved good results in multimodal sentiment analysis. Some scholars have used tensor-based fusion methods [33–35] to solve the problem of fusing multimodal features, and other researchers have adopted approaches such as self-supervised learning [36], contrastive learning [37], and multi-task learning [38].…”
Section: Introduction, mentioning
confidence: 99%
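The tensor-based fusion this statement mentions [33–35] is compact enough to sketch directly. The snippet below shows the core TFN-style operation, padding each per-modality utterance vector with a constant 1 and taking a three-way outer product so that unimodal, bimodal, and trimodal interaction terms all appear; the dimensions are illustrative assumptions.

```python
import torch

def tensor_fusion(text_vec, audio_vec, video_vec):
    """TFN-style fusion: outer product of per-modality vectors,
    each padded with a 1 so lower-order interaction terms survive."""
    B = text_vec.size(0)
    one = torch.ones(B, 1)
    t = torch.cat([text_vec, one], dim=1)   # (B, d_t + 1)
    a = torch.cat([audio_vec, one], dim=1)  # (B, d_a + 1)
    v = torch.cat([video_vec, one], dim=1)  # (B, d_v + 1)
    # batched 3-way outer product, flattened into one fusion feature
    fused = torch.einsum('bi,bj,bk->bijk', t, a, v)
    return fused.flatten(start_dim=1)       # (B, (d_t+1)(d_a+1)(d_v+1))

# toy usage: 32-d text, 16-d audio, 16-d video utterance embeddings
f = tensor_fusion(torch.randn(4, 32), torch.randn(4, 16), torch.randn(4, 16))
```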
“…Chen et al. [39] and Poria et al. [11] used LSTM-based models together with attention units to capture dynamics across modalities. In [30–34], multi-head and self-attention were used to capture relevant information within or across modalities. In addition, researchers have used other methods, e.g., the Gated Recurrent Unit (GRU) [35, 36] and the Graph Convolutional Network (GCN) [37].…”
Section: Attention-based, mentioning
confidence: 99%
“…Poria et al. [11] used attention units to capture dynamics across modalities. In [30–34], multi-head and self-attention were used to perform cross-modal interactions and to perceive emotional information that is not available within a single modality. In addition, researchers have used other attention-based methods such as Gated Recurrent Units (GRUs) [35, 36] and Graph Convolutional Networks (GCNs) [37].…”
Section: Introduction, mentioning
confidence: 99%
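As a complement to the attention-based blocks above, here is a minimal sketch of the GRU-based alternative these statements mention [35, 36]: a bidirectional GRU summarizing a frame-level acoustic sequence into a single utterance vector. The input and hidden sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GRUEncoder(nn.Module):
    """Bidirectional GRU that pools a frame sequence into one utterance vector."""
    def __init__(self, d_in=74, d_hidden=64):
        super().__init__()
        self.gru = nn.GRU(d_in, d_hidden, batch_first=True, bidirectional=True)

    def forward(self, x):                        # x: (batch, T, d_in)
        _, h = self.gru(x)                       # h: (2, batch, d_hidden)
        return torch.cat([h[0], h[1]], dim=-1)   # (batch, 2 * d_hidden)

utt = GRUEncoder()(torch.randn(2, 200, 74))      # (2, 128)
```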
“…Some researchers also use multiple self-attention blocks to combine different modalities in pairs through the self-attention mechanism [15]. The RoBERTa model [16] is trained on audio data as a dynamic representation alongside the text features, achieving very good results.…”
Section: Multi-modal Sentiment Analysis Model, mentioning
confidence: 99%
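Since this last statement refers back to TEASEL's own speech-prefix idea, the following is a hedged PyTorch sketch of that pattern using the Hugging Face RobertaModel: audio features are projected into the language model's embedding space and prepended to the token embeddings via inputs_embeds. The projection layer, prefix pooling, and dimensions are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from transformers import RobertaModel

class SpeechPrefixedLM(nn.Module):
    """Sketch: prepend a speech-derived prefix to RoBERTa's token embeddings."""
    def __init__(self, d_audio=74, prefix_len=8):
        super().__init__()
        self.lm = RobertaModel.from_pretrained("roberta-base")
        d_model = self.lm.config.hidden_size           # 768 for roberta-base
        self.audio_proj = nn.Linear(d_audio, d_model)  # hypothetical projection
        self.pool = nn.AdaptiveAvgPool1d(prefix_len)   # fixed-length prefix

    def forward(self, input_ids, attention_mask, audio):
        # audio: (batch, T_audio, d_audio) -> prefix: (batch, prefix_len, d_model)
        prefix = self.audio_proj(audio).transpose(1, 2)
        prefix = self.pool(prefix).transpose(1, 2)
        tok = self.lm.embeddings.word_embeddings(input_ids)
        embeds = torch.cat([prefix, tok], dim=1)
        # extend the attention mask to cover the speech prefix tokens
        mask = torch.cat(
            [torch.ones(prefix.shape[:2],
                        dtype=attention_mask.dtype,
                        device=attention_mask.device),
             attention_mask], dim=1)
        return self.lm(inputs_embeds=embeds, attention_mask=mask)
```

Passing inputs_embeds lets RoBERTa add positional embeddings over the combined prefix-plus-token sequence, so the speech prefix is consumed like ordinary leading tokens.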