2021
DOI: 10.1162/qss_a_00106

Fine-grained classification of social science journal articles using textual data: A comparison of supervised machine learning approaches

Abstract: We compare two supervised machine learning algorithms, Multinomial Naïve Bayes and Gradient Boosting, to classify social science articles using textual data. The high level of granularity of the classification scheme used and the possibility that multiple categories are assigned to a document make this task challenging. To collect the training data, we query three discipline-specific thesauri to retrieve articles corresponding to specialties in the classification. The resulting data set consists of 113,909 recor…

Cited by 15 publications (22 citation statements)
References: 43 publications

“…Scientometric studies usually focus on authorship or measurement of journal or professional association contributions. However, they may also examine terms that appear in titles, abstracts, full texts of book chapters and journal articles, or keywords assigned by editors to published articles or publishing houses [9, 21, 22, 23]. González-Alcaide et al. [24] used scientometric analysis to identify the main research interests and directions on Chagas cardiomyopathy in the MEDLINE database.…”
Section: Literature Review (mentioning)
confidence: 99%
“…At the same time, it helps academic institutions and scientific literature management platforms analyze the development direction of disciplines [6], facilitates the exploration of knowledge production and dissemination, and accelerates the rapid development of scientific research [7,8]. Acknowledging the advantages of scientometric analysis, it has been widely used to evaluate leading scientific researchers or publications [9], examine the structure of a scientific field's network [10,11], reveal emerging issues [12], and help researchers study the development of research fields and disciplines by categorizing documents along multiple dimensions [4].…”
Section: Introduction (mentioning)
confidence: 99%