2020
DOI: 10.1109/access.2020.3027567

An Empirical Study of TextRank for Keyword Extraction

Abstract: As a typical keyword extraction technology, TextRank has been used in a wide variety of commercial applications, including text classification, information retrieval and clustering. In these applications, the parameters of TextRank, including the co-occurrence window size, iteration number and decay factor, are set roughly, which might affect the effectiveness of returned results. In this work, we conduct an empirical study on TextRank, towards finding optimal parameter settings for keyword extraction. The exp…
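
To make the three parameters named in the abstract concrete, here is a minimal Python sketch of a TextRank-style keyword ranker. It is an illustration under stated assumptions, not the paper's implementation: the function name `textrank_keywords` and its arguments are hypothetical, edges are unweighted, tokens are assumed to be already pre-processed, and the abstract's "decay factor" is taken to be the damping factor d in the standard update S(v) = (1 - d) + d * sum over neighbors u of S(u) / degree(u).

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, iterations=20, decay=0.85, top_k=5):
    """Rank tokens with a TextRank-style iteration (hypothetical helper).

    window     -- co-occurrence window size, in tokens (2 = adjacent words only)
    iterations -- fixed number of score updates (no convergence check)
    decay      -- assumed to be the damping factor d of the TextRank update
    """
    # Build an undirected, unweighted co-occurrence graph: two distinct
    # tokens are linked if they appear within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Every node in the graph starts with the same score.
    # Tokens that never co-occur with another token are not in the graph.
    scores = {word: 1.0 for word in neighbors}

    for _ in range(iterations):
        new_scores = {}
        for word in neighbors:
            # Each neighbor shares its score equally among its own neighbors.
            received = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            new_scores[word] = (1 - decay) + decay * received
        scores = new_scores

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    text = "keyword extraction ranks words keyword graph ranks keyword candidates"
    print(textrank_keywords(text.split(), window=3, iterations=30, decay=0.85, top_k=3))
```

Changing `window`, `iterations` or `decay` in the call above is the kind of parameter variation the abstract describes studying; the defaults shown here are placeholders, not values reported by the paper.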

Citations: cited by 36 publications (11 citation statements)
References: 41 publications
“…Pre-processing of input data is one of the important tasks to transform the unorganized data into a structural format [36]. It aims to improve the quality of the input data for models to better understand the patterns and extract distinctive features from the input data.…”
Section: B Pre-processing (mentioning)
confidence: 99%
“… • PS (Porter Stemmer) was used to index each news page/article to filter out any stop, repeated, and common words to avoid noise in the dataset. The algorithm was used, over several rounds, to remove any non-relevant words from the datasets/textual scripts before considering all criteria or defined rules (Zhang et al., 2020). Such an algorithm has been proven to be one of the best techniques in terms of performance (Joshi et al., 2016).…”
Section: Implementation Procedures Findings and Proposed Model (mentioning)
confidence: 99%
“…With the emergence of word vector technology that converts words into numerical vectors, word meaning measurement becomes possible. The main derived word vector generation models include Word2vec [15], GloVe [16], ELMo [17], and BERT [18]. The most commonly used are the Word2vec model and the BERT model.…”
Section: The Research Status (mentioning)
confidence: 99%