2018
DOI: 10.48550/arxiv.1802.00923
Preprint

Multi-attention Recurrent Network for Human Communication Comprehension

Cited by 7 publications (9 citation statements)
References 0 publications
“…Data is partitioned into predetermined train (12787 points), validation (3634 points) and test (1438 points) splits. Raw features for CMU-MOSI and CMU-MOSEI are obtained from CMU-Multimodal SDK [18] .…”
Section: Test Data Sets
confidence: 99%
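
The excerpt above relies on predetermined splits and on raw features obtained from the CMU-Multimodal SDK. A minimal sketch of applying such predetermined folds, assuming the public CMU-Multimodal SDK (mmsdk) and the mmdatasdk.cmu_mosei.standard_folds attribute names shown in its examples (names may differ across SDK versions):

# Sketch only: map CMU-MOSEI segments to the predetermined train/valid/test folds.
# Assumes the public CMU-Multimodal SDK (mmsdk); the standard_folds attribute
# names follow its examples and may vary by SDK version.
from mmsdk import mmdatasdk

folds = mmdatasdk.cmu_mosei.standard_folds
train_ids = set(folds.standard_train_fold)   # video ids of the training fold
valid_ids = set(folds.standard_valid_fold)   # video ids of the validation fold
test_ids = set(folds.standard_test_fold)     # video ids of the test fold

def fold_of(segment_id):
    # Segment ids in aligned data look like 'video_id[k]'; strip the index
    # and look the video up in the predetermined folds.
    video_id = segment_id.split('[')[0]
    if video_id in train_ids:
        return 'train'
    if video_id in valid_ids:
        return 'valid'
    if video_id in test_ids:
        return 'test'
    return 'unknown'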
“…Details of each type will be introduced in the following sections. 1) Extracted features from audio clips (Ex0, Ex04): The spoken system of modern Chinese is named 'Hanyu Pinyin', abbreviated to 'pinyin'. It is the official romanization system for Mandarin in mainland China [51].…”
Section: Learning Phonetic Features
confidence: 99%
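
As an illustration of the romanization step described above (not necessarily the tooling used in the cited work), a pinyin transcription can be obtained from Mandarin text with an off-the-shelf library such as pypinyin:

# Illustration only: romanize Mandarin text to pinyin with tone numbers.
# pypinyin is an assumed, off-the-shelf choice, not the cited paper's tool.
from pypinyin import lazy_pinyin, Style

text = "你好世界"
syllables = lazy_pinyin(text, style=Style.TONE3)
print(syllables)  # -> ['ni3', 'hao3', 'shi4', 'jie4']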
“…In recent years, sentiment analysis has become increasingly popular for processing social media data on online communities, blogs, wikis, microblogging platforms, and other online collaborative media [1]. Sentiment analysis is a branch of affective computing research [2], [3] that aims to classify text - but sometimes also audio and video [4], [5] - into either positive or negative - but sometimes also neutral [6]. Sentiment analysis techniques can be broadly categorized into symbolic and sub-symbolic approaches: the former include the use of lexicons [7], ontologies [8], and semantic networks [9] to encode the polarity associated with words and multiword expressions; the latter consist of supervised [10], semi-supervised [11] and unsupervised [12] machine learning techniques that perform sentiment classification based on word co-occurrence frequencies.…”
Section: Introduction
confidence: 99%
“…The dataset was composed of over 23,500 spoken sentence videos, totaling 65 hours, 53 minutes, and 36 seconds. The dataset had been segmented at the sentence level; the sentences had been transcribed, and audio, visual, and textual features had been generated and released as part of a public software development kit (SDK) (Zadeh et al, 2018b). Additionally, raw videos were available for download.…”
Section: MOSEI
confidence: 99%
“…This large quantity of data comes from real-world expressions of sentiment, offering a unique opportunity to train and test model performance and generalization on a large dataset. Additionally, Zadeh et al (2018b) released a software development kit (SDK) for training and testing models on the CMU-MOSEI dataset, with future work focusing on the addition of other multimodal datasets. These releases culminated in a challenge focused on human multimodal language with the opportunity to train a model and evaluate it on a held-out challenge test set.…”
Section: Introduction
confidence: 99%
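
A minimal sketch of obtaining the released CMU-MOSEI features through that SDK, assuming the recipe names (highlevel, labels) and the 'glove_vectors' stream key from the SDK's published examples; the local directory is arbitrary and names may differ across SDK versions:

# Sketch only: download the released CMU-MOSEI feature recipes and align all
# streams to word timing. Recipe and stream names follow the SDK's examples
# and may differ across versions; 'cmumosei/' is an arbitrary local directory.
from mmsdk import mmdatasdk

recipe = dict(mmdatasdk.cmu_mosei.highlevel, **mmdatasdk.cmu_mosei.labels)
dataset = mmdatasdk.mmdataset(recipe, 'cmumosei/')  # fetches .csd files on first use
dataset.align('glove_vectors')                      # align other streams to word timestamps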