2018
DOI: 10.3938/npsm.68.647

Build Up of a Subject Classification System from Collective Intelligence

Abstract: Systematized subject classification is essential for funding and assessing scientific projects. Conventionally, classification schemes are founded on the empirical knowledge of a group of experts; thus, the experts' perspectives have influenced the current systems of scientific classification. Those systems archived the current state of the art in practice, yet the global effect of accelerating scientific change over time has made updating the classification systems on a timely basis virtually impossi…

Cited by 3 publications (4 citation statements)
References: 18 publications
“…Categories are generally located at the end of a Wikipedia page and are designed to link related entries under a shared topic to make navigation easier. For the knowledge network, we designated a node as a category or page, treated as a proxy of a subject or scientific notion, and if there was a hyperlink from a subject to another subject, we assigned a directed link [29]. We considered categories and pages to be identical when they shared an identical name (e.g., category:science and page:science); thus, we merged them into a single node with inheriting connected links.…”
Section: Results; citation type: mentioning
confidence: 99%
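
The network construction described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' code: the `category:`/`page:` prefix convention is taken from the example in the statement, while the helper names `normalize` and `build_network`, the toy link list, and the choice to drop self-loops created by merging are all assumptions made for the example.

```python
from collections import defaultdict

def normalize(title):
    """Strip a 'category:' or 'page:' prefix so same-named entries share one node.

    Hypothetical helper: the prefix convention follows the example in the
    quoted statement (category:science and page:science).
    """
    prefix, sep, name = title.partition(":")
    if sep and prefix.lower() in {"category", "page"}:
        return name.strip()
    return title.strip()

def build_network(hyperlinks):
    """Build a directed graph (node -> set of successors) from (source, target) pairs.

    A category and a page with the same name collapse into a single node,
    which inherits the links of both.
    """
    graph = defaultdict(set)
    for source, target in hyperlinks:
        u, v = normalize(source), normalize(target)
        graph[u]  # touching the key registers the node even if it has no out-links
        graph[v]
        if u != v:  # assumption: self-loops produced by merging are dropped
            graph[u].add(v)
    return dict(graph)

# Toy hyperlinks, invented for illustration only.
links = [
    ("category:science", "page:physics"),
    ("page:science", "category:chemistry"),
    ("page:physics", "page:science"),
]
network = build_network(links)
```

Here `category:science` and `page:science` end up as the single node `science`, which carries the outgoing links of both original entries.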
“…Researchers also show that one can extract geopolitical ties from the shared interest in Wikipedia's hyperlink structure [24]. Several studies find that Wikipedia category data is a rich source of accurate knowledge [25,26,27,28,29], and thus, the Wikipedia category can be used as a good proxy for the knowledge structure by building flexible subject categories. In summary, Wikipedia is an abundant source of knowledge that one can use to examine the structure of knowledge in general society.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…A smaller number of papers study Wikipedia's potential in identification of thematic structures in science. Salah et al. [33], [37] construct a classification scheme of science and technology by extracting its backbone from Wikipedia, using the nodes reachable from the Scientific disciplines category. To extract the backbone of the network, pruning of insignificant links using the shortest path information and reduction using local structural information is done.…”
Section: Related Work; citation type: mentioning
confidence: 99%
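
As a rough illustration of the "nodes reachable from" a root category and the shortest-path information mentioned in the statement above, a breadth-first search gives both at once. This is only a sketch under stated assumptions: the statement does not specify the pruning or reduction rules, so none are implemented here, and the function name `reachable_with_depth` and the toy graph are hypothetical.

```python
from collections import deque

def reachable_with_depth(graph, root):
    """Return {node: shortest-path length from root} for every reachable node.

    `graph` maps a node to an iterable of its successors. Breadth-first
    search visits nodes in order of distance, so the first time a node is
    seen its recorded depth is its shortest-path length in link count.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth

# Toy category graph, invented for illustration only.
toy_graph = {
    "Scientific disciplines": ["Physics", "Chemistry"],
    "Physics": ["Optics"],
    "Chemistry": ["Optics"],
    "Cooking": ["Baking"],
}
depths = reachable_with_depth(toy_graph, "Scientific disciplines")
```

Nodes that never appear in `depths`, such as `Cooking` in the toy graph, are outside the part of the network reachable from the root; a pruning step of the kind the statement mentions would then operate on the reachable part only.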
“…On the contrary to the common use of unsupervised machine learning methods, this work is based on supervised methods, incorporating the "ground truth" knowledge from an expert classification scheme into the training/test data. Most of the related work based on Wikipedia utilizes the article interlinks or the category graph in conjunction with network analyses to identify articles/categories referring to disciplines or scientific concepts [33], [34], [37]. Those that use machine learning algorithms to classify Wikipedia articles as "appropriate" or not in a specific context, train their models on a smaller number of manually engineered features and smaller datasets compared to the method presented in this work, which in its core module uses automatically extracted features of larger dimension and larger training/test datasets.…”
Section: Related Work; citation type: mentioning
confidence: 99%