Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries 2018
DOI: 10.1145/3197026.3203869

Keyphrase Extraction Based on Prior Knowledge

Cited by 6 publications (8 citation statements)
References 3 publications

“…Bag of words (BoWs) is one of the most commonly used representations for text classification, an example being keyphrase extraction ( Caragea et al, 2016 ; He et al, 2018 ). BoW represents text as a set of unordered word-level tokens, without considering syntactical and sequential information.…”
Section: Related Work (mentioning)
confidence: 99%
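
To make the bag-of-words idea in the statement above concrete, here is a minimal Python sketch. It is an illustration only and not the method of any cited paper: the function name `bag_of_words` and the simple regex tokenizer are assumptions, and the sketch keeps token counts (a multiset) where the quoted passage speaks of a set.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Return token counts for `text`, discarding word order.

    Hypothetical helper for illustration: tokens are lowercase runs of
    letters and digits, which is a simplifying assumption.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


# Two texts with the same words in a different order produce the same
# representation, because sequence information is not kept.
first = bag_of_words("Keyphrase extraction improves retrieval")
second = bag_of_words("Retrieval improves keyphrase extraction")
print(first == second)
```

Because order is discarded, any model built on this representation cannot distinguish the two example texts, which is the limitation the quoted passage points to.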
“…We randomly select up to 150K abstracts per SC. This upper limit is based on our preliminary study (Wu et al, 2018). The ratio between the training and testing corpus is 9:1.…”
Section: Methods (mentioning)
confidence: 99%
“…The majority of words and phrases included in the vocabulary extracted from these articles provides general descriptions of knowledge, which are significantly different from those used in scholarly articles which describe specific domain knowledge. Statistically, the overlap between the vocabulary of pretrained GloVe (6 billion tokens) and WoS is only 37% (Wu et al, 2018). Nearly all of the WE models can be retrained.…”
Section: Retrained WE Models (mentioning)
confidence: 99%
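
A vocabulary overlap like the 37% figure quoted above could be computed along the following lines. This is a sketch under stated assumptions: the excerpt does not say which vocabulary serves as the denominator, so the hypothetical function `vocabulary_overlap` measures the share of the target (corpus) vocabulary that also appears in the reference (pretrained) vocabulary. The toy word lists are invented for illustration and are not data from the cited work.

```python
from typing import Iterable


def vocabulary_overlap(reference_vocab: Iterable[str], target_vocab: Iterable[str]) -> float:
    """Fraction of the target vocabulary that also occurs in the reference vocabulary.

    Assumption: the denominator is the target vocabulary size. The source
    excerpt does not specify this, so treat the choice as illustrative.
    """
    reference = set(reference_vocab)
    target = set(target_vocab)
    if not target:
        return 0.0
    return len(reference & target) / len(target)


# Toy vocabularies, invented for the example.
pretrained = ["the", "model", "data", "language"]
corpus = ["model", "data", "phonon", "perovskite"]
print(vocabulary_overlap(pretrained, corpus))
```

A low value under this definition would mean that most corpus terms have no pretrained vector, which is consistent with the excerpt's remark that the embedding models can be retrained.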
“…It captures the variation in terms and provides a consistent method to assign preferred terms for similar concepts [33]. Previous literature shows that integrating a controlled vocabulary into the extraction process of words and phrases can enhance the results [34][35][36]. We performed this phase using python scripts and delivered the data that were cleaned up in the csv file format.…”
Section: Data Pre-processing (mentioning)
confidence: 99%