Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-1049
Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features

Abstract: The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
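The method the abstract describes (Sent2Vec) composes a sentence embedding from the embeddings of its words and word n-grams. Below is a minimal sketch of that compositional idea only, assuming a toy vocabulary and random stand-in vectors rather than the learned ones; it is not the authors' implementation.

```python
# Illustration of composing a sentence embedding from n-gram vectors.
# The embedding table is a toy stand-in; in Sent2Vec these vectors are
# learned with an unsupervised CBOW-like objective over whole sentences.
import numpy as np

DIM = 4  # toy dimensionality; pretrained Sent2Vec models use larger vectors


def ngrams(tokens, n):
    """Return the list of word n-grams in a token sequence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def embed_sentence(sentence, table, max_n=2):
    """Average the embeddings of all unigrams and bigrams found in the table."""
    tokens = sentence.lower().split()
    feats = [g for n in range(1, max_n + 1) for g in ngrams(tokens, n)]
    vecs = [table[f] for f in feats if f in table]
    if not vecs:
        return np.zeros(DIM)
    return np.mean(vecs, axis=0)


# Toy embedding table (random stand-ins for learned vectors).
rng = np.random.default_rng(42)
vocab = ["the", "cat", "sat", "the cat", "cat sat"]
table = {w: rng.standard_normal(DIM) for w in vocab}

print(embed_sentence("The cat sat", table))
```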


Cited by 494 publications (407 citation statements)
References 30 publications
“…However, word embeddings came to the foreground with Mikolov, Chen, Corrado, and Dean (2013), who presented the popular Continuous Bag‐of‐Words model (CBOW) and the Continuous Skip‐gram model. Additionally, sentence embeddings (Doc2Vec (Lau & Baldwin, 2016) or Sent2vec (Pagliardini, Gupta, & Jaggi, 2018)) as well as the popular GloVe (Global Vectors) method (Pennington, Socher, & Manning, 2014) are utilized by keyphrase extraction methods.…”
Section: Unsupervised Methods
confidence: 99%
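For reference, the two word2vec objectives the quoted passage names differ only in prediction direction: CBOW predicts a word from its context, Skip-gram predicts the context from a word. A minimal sketch using gensim (a library not mentioned in the source; the corpus and hyperparameters are illustrative):

```python
# CBOW vs. Skip-gram as referenced above, using gensim's Word2Vec.
from gensim.models import Word2Vec

corpus = [
    ["sentence", "embeddings", "extend", "word", "embeddings"],
    ["keyphrase", "extraction", "uses", "word", "embeddings"],
]

# sg=0 selects the Continuous Bag-of-Words objective; sg=1 selects Skip-gram.
cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["embeddings"][:5])
print(skipgram.wv.most_similar("word", topn=2))
```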
“…Yin et al. (2016) [23]: 78.9 / 84.8; Pagliardini et al. (2018) [43]: 76.4 / 83.4; Ferreira et al. (2018) [44]: 74.08 / 83.…”
Section: MSRP Dataset
confidence: 98%
“…The classifiers tried to predict the respective labels from the text of the tweet alone. In the process, we analyzed the performance of four different classifier models: Bag of Words, Sent2Vec sentence embeddings [38] coupled with Support Vector Machines (SVMs) [39], FastText [40], and BERT [34]. The tokenization and word character encoding process was different for each model class.…”
Section: Training
confidence: 99%
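A minimal sketch of the quoted pipeline, assuming scikit-learn for the SVM; embed_sentences() here is a hypothetical stand-in for a pretrained Sent2Vec encoder (e.g. the embed_sentences method of the epfml/sent2vec Python bindings), and the tweets and labels are made up:

```python
# Sketch of the tweet-classification pipeline described above: fixed-size
# sentence embeddings fed to a linear SVM.
import hashlib

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC


def embed_sentences(sentences, dim=700):
    """Placeholder encoder: deterministic pseudo-embeddings keyed on the
    text, so the sketch runs without a pretrained Sent2Vec model."""
    vecs = []
    for s in sentences:
        seed = int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)
        vecs.append(np.random.default_rng(seed).standard_normal(dim))
    return np.stack(vecs)


# Toy labeled tweets (labels are illustrative).
tweets = ["great product", "awful service", "love it", "worst ever",
          "really good", "not good at all", "fantastic", "terrible"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X = embed_sentences(tweets)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0)

clf = LinearSVC()  # linear SVM on top of the sentence embeddings
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```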