2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER)
DOI: 10.1109/saner.2017.7884609

HDSKG: Harvesting domain specific knowledge graph from content of webpages

Abstract: A knowledge graph is useful for many different applications such as search result ranking, recommendation, and exploratory search. It integrates structural information about concepts across multiple information sources and links these concepts together. The extraction of domain specific relation triples (subject, verb phrase, object) is one of the important techniques for domain specific knowledge graph construction. In this research, an automatic method named HDSKG is proposed to discover domain specific concepts and th…
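The (subject, verb phrase, object) triples the abstract refers to can be illustrated with a small dependency-parsing sketch. This is not HDSKG's own pipeline, only a minimal illustration of the general technique; Python with spaCy and its en_core_web_sm model are assumptions.

```python
# Minimal sketch: extract (subject, verb phrase, object) triples from text
# with dependency parsing. Illustrative only -- not the HDSKG implementation.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed small English model

def extract_triples(text):
    """Yield (subject, verb, object) triples found in each sentence."""
    doc = nlp(text)
    for sent in doc.sents:
        for token in sent:
            if token.pos_ != "VERB":
                continue
            subjects = [c for c in token.children
                        if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children
                       if c.dep_ in ("dobj", "attr")]
            for subj in subjects:
                for obj in objects:
                    yield (subj.text, token.lemma_, obj.text)

for triple in extract_triples("HDSKG harvests domain specific triples from webpages."):
    print(triple)  # e.g. ('HDSKG', 'harvest', 'triples')
```

A real system would expand each argument to its full noun phrase (e.g. via token.subtree) and keep multi-word verb phrases; the sketch keeps only head words for brevity.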

Cited by 50 publications (31 citation statements) · References 40 publications
“…Among them, [40]–[42] extract structured knowledge from domain-related texts to construct domain knowledge graphs. However, HDSKG [44] extracts relational triples from Web pages and then uses a pre-trained SVM classifier and a domain dictionary to determine the domain relevance of the extracted triples. MeSH-like [45] extracts entity attributes from medical textbooks, medical online websites, and other domain-related texts using a rule-based approach and fuses them with the SimHash-TF-IDF algorithm.…”
Section: Construction of Knowledge Graph
Mentioning confidence: 99%
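The domain-relevance step that the excerpt above attributes to HDSKG (a pre-trained SVM classifier plus a domain dictionary) can be sketched as follows. The TF-IDF features, the toy training pairs, and the seed dictionary are all illustrative assumptions; only the overall shape of the filter follows the description.

```python
# Hedged sketch of SVM + dictionary domain-relevance filtering for triples.
# Training data and dictionary are toy stand-ins, not HDSKG's actual model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

DOMAIN_DICT = {"knowledge graph", "relation triple", "svm"}  # hypothetical seeds

def triple_text(triple):
    """Flatten a (subject, verb, object) triple into one lowercase string."""
    return " ".join(triple).lower()

# Toy labelled triples standing in for the paper's pre-trained classifier.
train = [("HDSKG", "extracts", "relation triples"),
         ("the cat", "chases", "the mouse")]
labels = [1, 0]  # 1 = domain relevant, 0 = irrelevant

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit([triple_text(t) for t in train], labels)

def is_domain_relevant(triple):
    """Accept a triple if the SVM predicts relevance or the dictionary matches."""
    text = triple_text(triple)
    dict_hit = any(term in text for term in DOMAIN_DICT)
    return bool(clf.predict([text])[0]) or dict_hit

print(is_domain_relevant(("HDSKG", "builds", "a knowledge graph")))  # True
```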
“…The syntactic analysis approach [15] is widely used to extract <subject, predicate, object> triples from sentences, but it is not applicable to constructing CCBase: entities and property values of coding conventions cannot be collected directly from sentences, and the predicates in triples parsed by the syntactic analysis approach cannot be used as relations in CCBase.…”
Section: B. Information Extraction
Mentioning confidence: 99%
“…We construct CCBase in a top-down way, extracting entities and relations from unstructured documents guided by an ontology. To evaluate the effectiveness of this method, we compare it with two bottom-up extraction methods: the popular open information extraction tool Open IE [20] and a domain-specific extraction method, HDSKG [15]. Three popular metrics are selected: precision, recall, and F1 score.…”
Section: A. Performance of Information Extraction
Mentioning confidence: 99%
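The comparison described above comes down to matching extracted triples against a gold-standard set. A minimal sketch of the three metrics, assuming exact-match scoring (the cited evaluation may use a looser matching criterion):

```python
# Precision, recall and F1 for extracted triples under exact-match scoring.
def prf1(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # true positives: triples found in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {("entity", "has", "property"), ("class", "follows", "convention")}
pred = {("entity", "has", "property"), ("class", "extends", "base")}
print(prf1(pred, gold))  # (0.5, 0.5, 0.5)
```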
“…These high-complexity/low-volume environments, driven by the Industry 4.0 trend and its techniques, represent a very difficult and challenging situation for production companies [12][13][14][15]. In today's practice, small and medium-sized enterprises (SMEs) can apply smart factory production line solutions embedded with IoT and CPS technology to embrace the opportunity of small lot sizes and varied product lines with low budgets for automation investments [8,12,[16][17][18]. By doing so, the gap experienced in today's manufacturing environment can be filled, and a highly flexible, productive, high-complexity/low-volume, SME-friendly paradigm can emerge [8,13].…”
Section: Introduction
Mentioning confidence: 99%