2018
DOI: 10.1007/s10115-018-1280-0

Multi-label dataless text classification with topic modeling

Abstract: Manually labeling documents is tedious and expensive, but it is essential for training a traditional text classifier. In recent years, a few dataless text classification techniques have been proposed to address this problem. However, existing works mainly center on single-label classification problems, that is, each document is restricted to belonging to a single category. In this paper, we propose a novel Seed-guided Multi-label Topic Model, named SMTM. With a few seed words relevant to each category, SMTM co…

Cited by 35 publications (21 citation statements)
References 35 publications

“…This section briefly discusses the preliminary study conducted to compile the domain knowledge related to PEN model traits [22]. The motivation to compile the seed words aggregated to personality models was raised due to the representation power of such terms in revealing the demographical of the topical categories [5]. In this sense, we adopted a mechanism called Automatic Personality Perceptions (APP) and executed a survey to gather public perception towards the list of the sentiment words extracted from myPersonality using Part-of-Speech Tagging elements.…”
Section: Overview Of the Preliminary Study (mentioning)
confidence: 99%
“…Because language and personality are strongly correlated, the use of seed words to categorize the topics according to human traits could be an alternative to overcome the inherent problems of collecting the personality-based datasets as the efforts to collect meaningful seed words are much cheaper and easier [5]. The recent literature also showed that incorporating seed knowledge to guide the auto-modeling is distributed in many aspects such as long documents [6], event detection and mapping [7], and unsupervised error estimation on various natural language text corpora [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…Additionally, there are many previous supervised topic models that directly incorporate the supervision information into the unsupervised versions, including both works on single-label learning [21,14,38,39,35,36,29,24,26] and multi-label classification [32,27,28,30,13,17,25,37,1]. These models have empirically achieved very competitive classification performance, however, they require labeled documents as inputs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, dataless text classification, i.e., a new paradigm of weakly supervised learning, has attracted increasing attention from the community [20,4,7,8,12,11,5,15,18,16,22,37,31,23]. The target is to build text classifiers by training over unlabeled documents with predefined representative words of categories (called seed words), instead of labeled documents.…”
Section: Introduction (mentioning)
confidence: 99%
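The idea in the statement above, assigning categories from seed words rather than labeled documents, can be illustrated with a minimal Python sketch. This is a naive keyword-count baseline, not SMTM or any method from the cited papers; the categories, seed words, and the names `SEED_WORDS`, `tokenize`, and `seed_label` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical seed words per category (illustrative, not from any paper).
SEED_WORDS = {
    "sports": {"game", "team", "score"},
    "finance": {"stock", "market", "bank"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def seed_label(text, seed_words=SEED_WORDS, min_hits=1):
    """Return every category whose seed words occur at least min_hits times.

    A document may receive several labels, or none, which is what makes
    this a multi-label assignment rather than a single-label one.
    """
    counts = Counter(tokenize(text))
    labels = []
    for category, seeds in seed_words.items():
        # Sum the occurrences of all seed words for this category.
        hits = sum(counts[word] for word in seeds)
        if hits >= min_hits:
            labels.append(category)
    return sorted(labels)
```

A document mentioning both a team and the stock market would receive both labels here, while a document with no seed words receives none. A topic model would go further by inferring category relevance from word co-occurrence, which this sketch does not attempt.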
“…It instead aligns anomaly patterns with human interests by leveraging human feedback. Semi-supervised anomaly detection. Semi-supervised learning methods [27], [28] have been studied in the context of anomaly detection. Semi-supervised anomaly detection assumes that a small set of labeled instances can be used to improve the performance [29].…”
Section: Related Work (mentioning)
confidence: 99%