2021
DOI: 10.1016/j.neucom.2020.09.078

TWilBert: Pre-trained deep bidirectional transformers for Spanish Twitter

Cited by 45 publications (31 citation statements)
References 19 publications
“…The attention mechanism is the crux behind many state-of-the-art sequence-to-sequence models used in machine translation and language processing [40] and it has recently shown good results on multi-label classification [41]. While the attention mechanism has also been recently adopted to perform learning of relationships among elements in material property prediction [34,35], our model additionally uses the attention mechanism to perform learning of relationships among multiple material properties by acting on the output of the multivariate Gaussian model as opposed to the composition itself.…”
Section: Discussion (mentioning)
confidence: 99%
“…Higher-order property correlation learning proceeds via an attention graph neural network, whose description can be found in prior literature [34,35,40,41]. We use five attention layers, namely, the message-passing operations are executed five times. Each attention layer also includes an element-wise feed-forward MLP which has two layers of 128 neurons each.…”
Section: H-CLMP Model (mentioning)
confidence: 99%
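To make the cited architecture concrete, the sketch below mirrors the description in the excerpt above: five stacked attention (message-passing) layers, each followed by an element-wise feed-forward MLP with two 128-unit layers. This is an illustrative reconstruction only, not the cited authors' code; the use of PyTorch's nn.MultiheadAttention, the residual connections, the head count, and the 128-dimensional node features are assumptions for illustration.

```python
# Illustrative sketch (assumptions noted above) of an attention stack with
# five message-passing rounds and an element-wise two-layer 128-unit MLP.
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # element-wise feed-forward MLP: two layers of 128 neurons each
        self.ffn = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)   # one round of message passing
        x = x + attn_out                   # residual connection (assumed)
        return x + self.ffn(x)             # element-wise MLP applied per node

# five attention layers => the message-passing operation is executed five times
model = nn.Sequential(*[AttentionBlock() for _ in range(5)])
nodes = torch.randn(2, 10, 128)            # (batch, nodes/properties, features)
out = model(nodes)
print(out.shape)                           # torch.Size([2, 10, 128])
```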
“…The most recent works proposed language models specifically pre-trained on tweet corpora: Thakkar and Pinnis [16] achieved encouraging performance leveraging a time-balanced evaluation set for sentiment analysis on Latvian tweets, comparing several BERT-based architectures, and Nguyen et al. [12] presented BERTweet, the first public large-scale pre-trained language model for English tweets; Ángel González et al. [15] proposed TWiLBERT, a specialization of the BERT architecture both for the Spanish language and the Twitter domain. For languages other than English, such as Persian [53] and Arabic [54], recent studies have also focused on deep neural networks such as CNN and LSTM.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…In the field of sentiment analysis of tweets, most of the scientific literature has obtained state-of-the-art results by training language models from scratch on corpora made up exclusively of tweets, so that the models can better handle the specific tweet jargon, characterized by a particular syntax and grammar that lacks punctuation and contains contracted or elongated words, keywords, hashtags, emoticons, emojis, and so on. These approaches, which work not only in English [11,12] but also in other languages such as Italian [13], Spanish [14,15], and Latvian [16], necessarily impose two constraints: the first is the need to build large corpora of tweets for training the language models in the specific language considered, and the second is the need for substantial hardware and time resources to train the models from scratch on these corpora.…”
Section: Introduction (mentioning)
confidence: 99%
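As a concrete illustration of the alternative to from-scratch pre-training discussed in the excerpts above, the following sketch fine-tunes a publicly released tweet-domain encoder for sentiment classification. It assumes the Hugging Face transformers library and the public vinai/bertweet-base checkpoint (BERTweet [12]); the same pattern would apply to any tweet-specific encoder such as TWilBERT, which its authors distribute separately. The label set and example tweets are placeholders.

```python
# Illustrative sketch (assumptions noted above): load a tweet-domain BERT
# encoder with a fresh classification head and run it on raw tweets.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "vinai/bertweet-base"  # public English tweet encoder (BERTweet)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# The classification head below is randomly initialised; in practice it would
# be fine-tuned on a labelled tweet corpus (e.g. negative / neutral / positive).
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

tweets = ["SO HAPPY today!!! :)", "worst service ever @some_airline"]
batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    logits = model(**batch).logits      # shape: (batch_size, num_labels)
print(logits.argmax(dim=-1))            # predicted class index per tweet
```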