2020
DOI: 10.1111/exsy.12613

A co‐training‐based approach for the hierarchical multi‐label classification of research papers

Abstract: This paper focuses on the problem of the hierarchical multi-label classification of research papers, which is the task of assigning the set of relevant labels for a paper from a hierarchy, using reduced amounts of labelled training data. Specifically, we study leveraging unlabelled data, which are usually plentiful and easy to collect, in addition to the few available labelled ones in a semi-supervised learning framework for achieving better performance results. Thus, in this paper, we propose a semi-supervised…
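The co-training idea summarized in the abstract, i.e., two classifiers trained on different views of the papers that label unlabelled examples for each other, can be sketched as follows. This is a minimal hypothetical illustration assuming scikit-learn, dense NumPy feature matrices for two views, binary-indicator labels over the hierarchy's categories, and invented function and parameter names; it is not the authors' implementation.

```python
# Minimal co-training sketch for multi-label classification (hypothetical,
# not the paper's implementation). Assumes two dense feature views X1/X2,
# a binary label-indicator matrix Y_lab, and that every label has both
# positive and negative examples in the labelled seed set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier


def co_training(X1_lab, X2_lab, Y_lab, X1_unl, X2_unl,
                rounds=10, per_round=20, threshold=0.9):
    """Grow the labelled set by exchanging confident pseudo-labels
    between two classifiers, each trained on its own feature view."""
    clf1 = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    clf2 = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    pool = np.arange(len(X1_unl))          # indices of still-unlabelled papers
    for _ in range(rounds):
        if len(pool) == 0:
            break
        clf1.fit(X1_lab, Y_lab)
        clf2.fit(X2_lab, Y_lab)
        p1 = clf1.predict_proba(X1_unl[pool])   # per-label probabilities, view 1
        p2 = clf2.predict_proba(X2_unl[pool])   # per-label probabilities, view 2
        # A paper's confidence is its least certain label under either view.
        conf = np.minimum(np.maximum(p1, 1 - p1),
                          np.maximum(p2, 1 - p2)).min(axis=1)
        picked = np.where(conf >= threshold)[0]
        picked = picked[np.argsort(conf[picked])[::-1][:per_round]]
        if len(picked) == 0:
            break
        # Pseudo-label the selected papers with the averaged prediction
        # and move them into the labelled set.
        pseudo = ((p1[picked] + p2[picked]) / 2 >= 0.5).astype(int)
        idx = pool[picked]
        X1_lab = np.vstack([X1_lab, X1_unl[idx]])
        X2_lab = np.vstack([X2_lab, X2_unl[idx]])
        Y_lab = np.vstack([Y_lab, pseudo])
        pool = np.delete(pool, picked)
    return clf1, clf2
```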

Cited by 14 publications (12 citation statements)
References: 49 publications
“…In [15], the author uses a method based on collaborative training to classify papers by layers and labels, then classifies data from two perspectives, and finally adds the most likely correct classification result to the marked data set, which amounts to a semi-supervised classification method. In [16], the author uses the LDA topic model to solve the problem of semantic similarity measurement in traditional text classification and the K-nearest neighbour algorithm as a classifier employed in the classification of samples.…”
Section: Semantic-based Approach (mentioning)
confidence: 99%
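For the LDA-plus-nearest-neighbour approach the statement attributes to [16], a rough sketch might look like the following. It assumes scikit-learn's CountVectorizer, LatentDirichletAllocation, and KNeighborsClassifier with hypothetical parameter values; it is not that paper's implementation.

```python
# Rough sketch of an LDA + k-nearest-neighbour text classifier, in the spirit
# of the approach attributed to [16] (hypothetical, illustrative code).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


def lda_knn_classifier(n_topics=50, k=5):
    """Documents are mapped to LDA topic distributions, and neighbours are
    compared in that topic space rather than in raw term space."""
    return make_pipeline(
        CountVectorizer(stop_words="english"),
        LatentDirichletAllocation(n_components=n_topics, random_state=0),
        KNeighborsClassifier(n_neighbors=k),
    )

# Usage (hypothetical data):
# clf = lda_knn_classifier()
# clf.fit(train_texts, train_labels)
# predicted = clf.predict(test_texts)
```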
“…Standard co‐training algorithms assume that data have two conditionally independent views (i.e., feature spaces), each of which is sufficient to train a classifier. As a general framework, various kinds of classifiers can easily adapt to co‐training (Han et al, 2018; Masmoudi et al, 2021). However, many tasks can hardly satisfy the strong assumption of two compatible and uncorrelated views.…”
Section: Related Work (mentioning)
confidence: 99%
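The two-view assumption discussed in the statement above is often approximated in practice by splitting each paper's representation into weakly correlated feature blocks. A minimal sketch, assuming a hypothetical text-versus-metadata split and invented field names (not the views used in the cited works):

```python
# Hypothetical two-view split for co-training: textual content in one view,
# bibliographic metadata in the other, so the views overlap as little as
# possible (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer


def build_views(papers):
    """papers: list of dicts with an 'abstract' string and a list of
    'cited_venues' (hypothetical field names)."""
    text_view = TfidfVectorizer(max_features=5000).fit_transform(
        [p["abstract"] for p in papers]
    ).toarray()
    meta_view = MultiLabelBinarizer().fit_transform(
        [p["cited_venues"] for p in papers]
    )
    return text_view, meta_view
```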
“…The proposed co-training model is an extension of self-training in which two or more classifiers are trained on a base of labelled documents and used to pseudo-label unlabelled ones; the aim is to share the pseudo-labels between the classifiers so as to improve prediction accuracy [26]. For each classifier, a different perspective on the features of the labelled documents, called a view, is sought; the less correlated the features across the views, the better the prediction, which is why this model is known as a multi-view model, and training proceeds through a learning network [27]. Algorithm 2 presents the structure of the model:…”
Section: Co-training (unclassified)