SemGraph: Extracting keyphrases following a novel semantic graph‐based approach

Martínez-Romo, Juan; Araujo, Lourdes; Fernandez, Andres Duque

doi:10.1002/asi.23365

Cited by 34 publications

(21 citation statements)

References 17 publications


“…Liu et al (2009) changed the weight of nodes in the graph with a topic-related degree of words by Latent Dirichlet Allocation topic algorithm. To improve the accuracy of weight assignment of the relationship between words, Martinez-Romo, Araujo, and Fernandez (2016) proposed a strategy in which the relationship between words is measured by the significant co-occurrence and the relationship in WordNet. To test the significant co-occurrence of two words, a null model-based statistical hypothesis test was employed.…”

Section: Related Work (mentioning)

confidence: 99%
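The citation statement above says that significant co-occurrence was tested against a null model. A minimal sketch of one such test follows, assuming a hypergeometric null model in which the two words are scattered independently over a fixed number of text units (sentences or windows). The function names, the choice of text unit, and the `alpha` threshold are illustrative assumptions, not details taken from the SemGraph paper.

```python
from math import comb


def cooccurrence_p_value(n_units, n_a, n_b, n_both):
    """Probability of seeing at least n_both joint occurrences by chance.

    Assumed null model: word A occupies n_a of the n_units text units and
    word B occupies n_b of them, placed independently at random
    (hypergeometric distribution).
    """
    if not (0 <= n_both <= min(n_a, n_b) <= max(n_a, n_b) <= n_units):
        raise ValueError("inconsistent counts")
    total = comb(n_units, n_b)
    # Upper tail: sum the probability of every overlap size >= n_both.
    tail = sum(
        comb(n_a, k) * comb(n_units - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / total


def is_significant(n_units, n_a, n_b, n_both, alpha=0.01):
    """True when the observed co-occurrence is unlikely under the null model."""
    return cooccurrence_p_value(n_units, n_a, n_b, n_both) <= alpha
```

Under this sketch, an edge between two words would be kept in the graph only when `is_significant` returns `True`; how the resulting p-value is combined with WordNet relations is not covered here.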

Joint Modeling of Characters, Words, and Conversation Contexts for Microblog Keyphrase Extraction

Zhang

2019

Association for Information Science and Technology


Millions of messages are produced on microblog platforms every day, leading to the pressing need for automatic identification of key points from the massive texts. To absorb salient content from the vast bulk of microblog posts, this article focuses on the task of microblog keyphrase extraction. In previous work, most efforts treat messages as independent documents and might suffer from the data sparsity problem exhibited in short and informal microblog posts. On the contrary, we propose to enrich contexts via exploiting conversations initialized by target posts and formed by their replies, which are generally centered around relevant topics to the target posts and therefore helpful for keyphrase identification. Concretely, we present a neural keyphrase extraction framework, which has 2 modules: a conversation context encoder and a keyphrase tagger. The conversation context encoder captures indicative representation from their conversation contexts and feeds the representation into the keyphrase tagger, and the keyphrase tagger extracts salient words from target posts. The 2 modules were trained jointly to optimize the conversation context encoding and keyphrase extraction processes. In the conversation context encoder, we leverage hierarchical structures to capture the word‐level indicative representation and message‐level indicative representation hierarchically. In both of the modules, we apply character‐level representations, which enables the model to explore morphological features and deal with the out‐of‐vocabulary problem caused by the informal language style of microblog messages. Extensive comparison results on real‐life data sets indicate that our model outperforms state‐of‐the‐art models from previous studies.


“…Current research proposes several and diverse methods for automatic text summarization such as statistical [22], machine learning [23,24], text connectivity [25,26], conceptual graphs [27,28,29], algebraic reduction [30], clustering and probabilistic models [31,32,33] and methods adapted to the reader [34,35].…”

Section: Automatic Text Summarization (mentioning)

confidence: 99%

Towards Personalized Summaries in Spanish based on Learning Styles Theory

Ramírez, Hernández, Martínez

2019

RCS


Today, advances in information technologies have generated perhaps the largest and fastest exponential growth of electronic texts. On the Internet there are many electronic documents, such as books, technical documents, news articles, blogs, chats, emails and many other digital files. As a result, a user who wants to read and understand this information in a short time will find it a hard task. In this paper, we present work on automatic text summarization that also considers the particular needs of readers. Thus, a model for personalized summarization based on learning styles theory is proposed.


“…Several popular graph-based systems have been proposed by researchers for example, TextRank [22], SingleRank [23], ExpandRank [24], SGRank [41]. Some other graph-based methods are recently introduced [42][43][44][45]. Most of the graph-based keyphrase extraction methods prefer single words as nodes that may result in missing multiword phrases [1], which is one of the drawbacks of graph-based methods.…”

Section: Related Work (mentioning)

confidence: 99%
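The drawback named in the statement above (single words as graph nodes) is easiest to see in a small example. The following is a simplified TextRank-style sketch, not the SemGraph method or any cited system's exact implementation; the function name, the window size, the damping factor, and the iteration count are illustrative assumptions.

```python
from collections import defaultdict


def textrank(tokens, window=2, damping=0.85, iterations=50):
    """Rank single words with a PageRank-style score on a co-occurrence graph."""
    # Undirected graph: words are linked when they appear within `window`
    # positions of each other (window=2 links adjacent words only).
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[word])
            for word in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = textrank("graph keyphrase graph extraction graph ranking".split())
```

Because every node is a single token, a phrase such as "keyphrase extraction" never appears in `ranked` as one unit; it would have to be reassembled from its words in a separate post-processing step.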

Key Concept Identification: A Comprehensive Analysis of Frequency and Topical Graph-Based Approaches

et al. 2018


Automatic key concept extraction from text is the main challenging task in information extraction, information retrieval and digital libraries, ontology learning, and text analysis. The statistical frequency and topical graph-based ranking are the two kinds of potentially powerful and leading unsupervised approaches in this area, devised to address the problem. To utilize the potential of these approaches and improve key concept identification, a comprehensive performance analysis of these approaches on datasets from different domains is needed. The objective of the study presented in this paper is to perform a comprehensive empirical analysis of selected frequency and topical graph-based algorithms for key concept extraction on three different datasets, to identify the major sources of error in these approaches. For experimental analysis, we have selected TF-IDF, KP-Miner and TopicRank. Three major sources of error, i.e., frequency errors, syntactical errors and semantical errors, and the factors that contribute to these errors are identified. Analysis of the results reveals that performance of the selected approaches is significantly degraded by these errors. These findings can help us develop an intelligent solution for key concept extraction in the future.
