2018
DOI: 10.1016/j.patcog.2018.06.004

A benchmark and comparison of active learning for logistic regression

Abstract: Logistic regression is by far the most widely used classifier in real-world applications. In this paper, we benchmark the state-of-the-art active learning methods for logistic regression and discuss and illustrate their underlying characteristics. Experiments are carried out on three synthetic datasets and 44 real-world datasets, providing insight into the behaviors of these active learning methods with respect to the area of the learning curve (which plots classification accuracy as a function of the number o…

Cited by 126 publications (105 citation statements)
References 53 publications
“…We focus on logistic regression because it has been widely used in applied sciences [11]. We focus on the binary classification case in which there are two classes.…”
Section: Methods (mentioning)
confidence: 99%
“…In our comparison on the KDD99 dataset [15], J = 1, C = 2, and we varied B. It has been used as a standard benchmark dataset for classification algorithms, for example in [5] and [11]. Note that we aim to classify normal and abnormal cases (attacking), hence C = 2.…”
Section: B On Amount Of Transmitted Data (mentioning)
confidence: 99%
“…A critical aspect for an active learner is represented by the strategy used to query the next sample to be labeled. Four main query frameworks exist, which rely mostly on heuristics: informativeness [58,13,17,4], representativeness [46,48], hybrid [22,57], and performance-based [47,16,12,56]. Among all these, informativeness-based approaches are the most successful ones.…”
Section: Related Work (mentioning)
confidence: 99%
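
Illustration (not taken from the cited papers): the statement above names informativeness-based querying without showing what it looks like. A minimal sketch of one common informativeness heuristic, uncertainty sampling, for binary logistic regression follows. The function names, weights, and pool values are hypothetical and chosen only for illustration; the benchmarked methods may differ substantially.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """P(y = 1 | x) under a binary logistic regression model."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(score)


def most_uncertain(weights, bias, pool):
    """Index of the unlabeled sample whose predicted probability is closest to 0.5."""
    return min(
        range(len(pool)),
        key=lambda i: abs(predict_proba(weights, bias, pool[i]) - 0.5),
    )


# Hypothetical model parameters and unlabeled pool (assumptions for this sketch).
weights, bias = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [0.2, 0.1], [0.0, 4.0]]

# The second sample lies nearest the decision boundary, so it is queried next.
query_index = most_uncertain(weights, bias, pool)
```

In a full active learning loop, the queried sample would be labeled, added to the training set, and the model refit before the next query.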
“…The problem formulation has direct relations to sequential analysis [121] and optimal experimental design [40]. Overviews of current techniques can be found in [19], [111], and [126]. One of the major issues in active learning is that the systematic collection of labeled training data typically leads to a systematic bias as well.…”
Section: Active Learning (mentioning)
confidence: 99%