2020
DOI: 10.1007/978-3-030-45442-5_41

Keyphrase Extraction as Sequence Labeling Using Contextualized Embeddings

Abstract: In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three different benchmark datasets (Inspec, SemEval 2010, SemEval 2017), and compare with existing popular unsupervised and supervised techniques. Our results quantify the benefits of: (a) using conte…

Cited by 57 publications (57 citation statements)
References: 23 publications
“…Problem Formulation: Similar to recent works (Sahrawat et al, 2020), we formulate keyphrase extraction as a sequence labeling task. Let D = (t_1, t_2, ..., t_n) be a document consisting of n tokens, where t_i represents the i-th token of the document.…”
Section: Preliminaries (mentioning)
confidence: 99%
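
The sequence labeling view quoted above can be illustrated with a minimal sketch. The quoted passage is cut off before it names a label set, so the B/I/O scheme used here (B = first token of a keyphrase, I = later token of a keyphrase, O = any other token) is an assumption, and the function name `bio_labels` and the example sentence are hypothetical. The sketch only shows how a tokenized document and its gold keyphrases might be turned into one label per token; it is not the cited authors' preprocessing.

```python
from typing import List, Sequence


def bio_labels(tokens: Sequence[str],
               keyphrases: Sequence[Sequence[str]]) -> List[str]:
    """Return one label per token: B, I, or O (assumed scheme)."""
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)
    for phrase in keyphrases:
        target = [w.lower() for w in phrase]
        k = len(target)
        if k == 0:
            continue  # ignore empty keyphrases
        for start in range(len(lowered) - k + 1):
            # Only label a span that matches and is not already labeled.
            span_free = all(lab == "O" for lab in labels[start:start + k])
            if lowered[start:start + k] == target and span_free:
                labels[start] = "B"
                for j in range(start + 1, start + k):
                    labels[j] = "I"
    return labels


# Hypothetical document D = (t_1, ..., t_n) and gold keyphrases.
tokens = "We use a conditional random field for keyphrase extraction .".split()
keyphrases = [["conditional", "random", "field"], ["keyphrase", "extraction"]]
labels = bio_labels(tokens, keyphrases)

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

A sequence labeler such as a BiLSTM-CRF would then be trained to predict these per-token labels, and keyphrases are read off as maximal B-I runs at prediction time.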
“…Baseline Models: In this work, we employ the BiLSTM-CRF architecture as the baseline architecture (Huang et al, 2015; Alzaidy et al, 2019; Sahrawat et al, 2020; Zhu et al, 2020). Figure 1 shows a high-level overview of our baseline model.…”
Section: Preliminaries (mentioning)
confidence: 99%
“…Keyphrase generation is the process of predicting both extractive and abstractive keyphrases from a given document. Most of the previous works in keyphrase domain, including both supervised and unsupervised techniques, primarily focus on extractive keyphrases (Hasan and Ng, 2014;Mahata et al, 2018;Sahrawat et al, 2020). Recent studies Meng et al (2017); Ye and Wang (2018); Chan et al (2019) have started to develop generative approaches that produce both abstractive and extractive keyphrases from documents.…”
Section: Introduction (mentioning)
confidence: 99%
“…The first corpora for automated keyphrase extraction were likewise assembled out of publications from scientific fields including technical reports (Witten et al, 1999), paper abstracts (Hulth, 2003), and scientific papers (Nguyen and Kan, 2007;Medelyan et al, 2009;Kim et al, 2010). To this day, scientific publications still serve as a fundamental fixed-domain benchmark for neural KPE methods (Meng et al, 2017;Alzaidy et al, 2019;Sahrawat et al, 2019) due to the availability of ample data of this kind. However, experiments have revealed that KPE methods trained directly on such corpora do not generalize well to other web-related genres or other types of documents (Chen et al, 2018;Xiong et al, 2019), where there may be far more heterogeneity in topics, content and structure, and there may be more variation in terms of where a key phrase may appear.…”
Section: Introduction (mentioning)
confidence: 99%