2021
DOI: 10.1007/s11192-021-03984-1

Semantic and relational spaces in science of science: deep learning models for article vectorisation

Abstract: Over the last century, we observe a steady and exponential growth of scientific publications globally. The overwhelming amount of available literature makes a holistic analysis of the research within a field and between fields based on manual inspection impossible. Automatic techniques to support the process of literature review are required to find the epistemic and social patterns that are embedded in scientific publications. In computer sciences, new tools have been developed to deal with large volumes of d…

Citations: cited by 11 publications (7 citation statements)
References: 46 publications
“…Instead of selecting the journals based solely on author's judgement (Liu et al, 2019), we use the Content Analysis Toolkit for Academic Research (CATAR) 1 to apply agglomerative hierarchical clustering and multi-dimensional scaling based on bibliographical coupling similarity for cross-validation. Because the relatedness between scientific publications is multidimensional, no single measurement captures it perfectly; however, automated ways to analyze the semantic contents of articles are crucial as an effective response to the above-shown exponentially-increasing volume of scientific research articles (see Kozlowski et al, 2021). This method allowed us to double-check the topical relatedness of the overall 243 educational journals by analyzing the degree of overlap among the large volume of papers' references in all these educational journals (Small & Koenig, 1977;Tseng & Tsay, 2013;Yan & Ding, 2012).…”
Section: Data Collection and Cleaning (mentioning)
confidence: 99%
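
The bibliographic coupling idea mentioned in the statement above (relatedness measured by overlap in reference lists) can be sketched as follows. This is a minimal illustration, not CATAR's implementation: the cosine-style normalisation, the function names, and the toy reference lists are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def coupling_strength(refs_a, refs_b):
    """Raw bibliographic coupling: number of references two sources share."""
    return len(set(refs_a) & set(refs_b))


def coupling_similarity(refs_a, refs_b):
    """Shared references divided by sqrt(|A| * |B|).

    This normalisation is an assumption for illustration; the source does
    not say which similarity measure CATAR applies.
    """
    a, b = set(refs_a), set(refs_b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical reference lists keyed by hypothetical journal names.
reference_lists = {
    "journal_A": ["r1", "r2", "r3", "r4"],
    "journal_B": ["r2", "r3", "r4", "r5"],
    "journal_C": ["r8", "r9"],
}

# Pairwise similarity for every unordered pair of journals.
pairwise = {
    (x, y): coupling_similarity(reference_lists[x], reference_lists[y])
    for x, y in combinations(sorted(reference_lists), 2)
}

for pair, score in pairwise.items():
    print(pair, round(score, 3))
```

A matrix of such pairwise scores is the kind of input that hierarchical clustering or multi-dimensional scaling would then operate on.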
“…Meanwhile, the number of remote sensing research related to in-depth learning increased from 3 in 2015 (accounting for 0.3% of the total number of research) to 98 in 2021 (5.43%), showing a significant upward trend (Figure 8). In recent years, with the soaring development of computer vision, such as image classification, target identification, and semantic segmentation, deep learning has been widely applied to remote sensing and has become an important innovation driver for remote sensing research (Zhang et al, 2016;Leung et al, 2017;Kozlowski et al, 2020).…”
Section: No (mentioning)
confidence: 99%
“…This refers to the task of discovering useful representations of scientific articles, typically for use in downstream applications like classification, prediction, and retrieval [4]. These methods rely primarily on Natural Language Processing (NLP) and Network Analysis techniques to learn document embeddings according to a paper's content, a paper's citation relations, or some combination of these sources [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…Representation learning for scientific documents is the task of representing research papers in some dense vector space, such that the important similarities and relations between the papers are preserved i.e., semantically related papers should have similar representations (or embeddings) [12]. Scientific articles may be related if they pertain to the same topic (i.e., have similar content), or if there is an application or transfer of knowledge from one to the other (i.e., a citation).…”
Section: Introduction (mentioning)
confidence: 99%
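
The property described in the statement above, that semantically related papers should have similar embeddings, is commonly checked with cosine similarity. The sketch below assumes hand-written three-dimensional vectors and hypothetical paper names purely for illustration; real document embeddings would come from a trained model and have far more dimensions.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy embeddings (assumed values, not output of any real model).
embeddings = {
    "paper_nlp_1": [0.9, 0.1, 0.0],
    "paper_nlp_2": [0.8, 0.2, 0.1],
    "paper_remote_sensing": [0.0, 0.2, 0.9],
}


def most_similar(query, vectors):
    """Return the name of the paper whose embedding is closest to the query's."""
    candidates = (name for name in vectors if name != query)
    return max(
        candidates,
        key=lambda name: cosine_similarity(vectors[query], vectors[name]),
    )


print(most_similar("paper_nlp_1", embeddings))
```

The same nearest-neighbour lookup underlies downstream uses such as retrieval, where the query is a paper and the candidates are the rest of a corpus.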