2016
DOI: 10.1007/978-3-319-49004-5_30

TechMiner: Extracting Technologies from Academic Publications

Abstract: In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture 'standard' scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and seman…

Cited by 9 publications (9 citation statements)
References 19 publications
“…In general it is possible to generate a set of technologies by i) extracting them from research papers by means of automatic methods [12,13], ii) obtaining them from a manually curated software repository (e.g., the Resource Identification Initiative portal [14]), or iii) getting them from a general knowledge base (e.g., DBpedia [15]). Since the focus of this study is on the analysis of technologies and not on their identification, we created a manually curated set of technologies in the fields of Semantic Web and Artificial Intelligence.…”
Section: Input Knowledge Bases (mentioning)
confidence: 99%
“…We first selected an initial set of about 2,000 technologies by running TechMiner [12] on a set of 3,000 papers in Semantic Web in order to find technologies that were originated or adopted by this field. We then manually cleaned and enriched the resulting dataset by discarding incorrect results and we also added 500 other technologies sourced from Wikipedia pages listing Artificial Intelligence and Machine Learning algorithms and methods.…”
Section: Input Knowledge Bases (mentioning)
confidence: 99%
“…Literature [20,26,27] shows that training of domain-specific NER/NETs is still an open challenge for two main reasons: (1) the long-tail nature of such entity types, both in existing knowledge bases and in the targeted document collections [22]; and (2) the high cost associated with the creation of hand-crafted rules, or human-labeled training datasets for supervised machine learning techniques. Few approaches addressed these problems by relying on bootstrapping [27] or Entity Expansion [3,11] techniques, achieving promising performance.…”
Section: Introduction (mentioning)
confidence: 99%