2010
DOI: 10.1007/978-3-642-13654-2_12

Keyphrases Extraction from Scientific Documents: Improving Machine Learning Approaches with Natural Language Processing

Cited by 37 publications (25 citation statements)
References 9 publications
“…The Hulth2003 3 [12] dataset is a document summary dataset from the Inspec physical and engineering literature database, in which each document has two sets of keywords assigned: the manually controlled assigned keywords appear in the Inspec thesaurus instead of document, and the uncontrolled keywords, which are freely assigned by the editors; (2) Krapivin2009. Krapivin2009 4 [13] dataset has 2,304 full papers from the computer science domain, which are published by ACM in the period between 2003 and 2005, and each paper has keywords originally labeled by the authors and verified by the reviewers. In the experiments, we extract the abstracts of papers from datasets, and use user-assigned keywords as the ground truth.…”
Section: Experimental Settings a Datasets (mentioning)
confidence: 99%
“…In this paper, we conduct an empirical study on TextRank, towards finding an optimal parameter setting for keyword extraction. The experiments are done in Hulth2003 [12] and Krapivin2009 [13] datasets. In the text preprocessing stage, we remove the stop word by an open published English stop word list XPO6.…”
Section: Introduction (mentioning)
confidence: 99%
“…NLP techniques were used by Krapivin et al (2010) in [18] to consider machine learning approach and improve them (SVM, Local SVM, Random Forests) to solve the problem of automatic keyphrases extraction from scientific papers. Evaluation showed efficient results that can that outperform state-of-the-art Bayesian learning system KEA on the same dataset without the use of controlled vocabularies.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hulth (2003) 800 T. Nguyen (2007) 120 X. 308 A.Schutz (2013) 500 M. Krapivin (2009) 680 K. SuNam (2013) 100…”
Section: References (mentioning)
confidence: 99%