2013
DOI: 10.1145/2513092.2513094

Batch Mode Active Sampling Based on Marginal Probability Distribution Matching

Abstract: Active learning is a machine learning and data mining technique that selects the most informative samples for labeling and uses them as training data; it is especially useful when there is a large amount of unlabeled data and labeling it is expensive. Recently, batch-mode active learning, where a set of samples is selected concurrently for labeling based on its collective merit, has attracted a lot of attention. The objective of batch-mode active learning is to select a set of informative samples so that …
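The truncated abstract describes selecting a batch of samples whose distribution matches that of the unlabeled pool. Below is a minimal, illustrative Python sketch of that general idea, not the authors' actual algorithm: it greedily grows a batch so that the empirical Maximum Mean Discrepancy (MMD) between the batch and the full unlabeled pool stays small. The function names (gaussian_kernel, mmd2, select_batch), the Gaussian kernel with a fixed gamma, and the greedy strategy are all assumptions made for illustration.

import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel matrix between rows of A and rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(X, Z, gamma=1.0):
    # Biased empirical estimate of squared MMD between sample sets X and Z.
    return (gaussian_kernel(X, X, gamma).mean()
            + gaussian_kernel(Z, Z, gamma).mean()
            - 2.0 * gaussian_kernel(X, Z, gamma).mean())

def select_batch(pool, batch_size, gamma=1.0):
    # Greedy forward selection: at each step add the pool point that most
    # reduces the estimated MMD between the chosen batch and the full pool.
    # (Kernel values are recomputed each step for simplicity, not speed.)
    chosen, remaining = [], list(range(len(pool)))
    for _ in range(batch_size):
        best_i, best_score = None, np.inf
        for i in remaining:
            score = mmd2(pool[chosen + [i]], pool, gamma)
            if score < best_score:
                best_i, best_score = i, score
        chosen.append(best_i)
        remaining.remove(best_i)
    return chosen

# Example: pick 5 representative points from a 2-D pool of 200 samples.
rng = np.random.default_rng(0)
pool = rng.normal(size=(200, 2))
print(select_batch(pool, batch_size=5))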

Cited by 58 publications (39 citation statements)
References 31 publications
“…The second category is to find the most representative points for the overall patterns of the unlabeled data while preserving the data distribution [17,18]. In particular, clustering for better sampling of representative points has been explored [3,4].…”
Section: Related Work
Mentioning, confidence: 99%
“…Further details, proofs, and theorems about the two-sample discrepancy problem can be found in [58,60]. Suppose we find a density function F as in Theorem 2; following [49], the empirical estimate of the distribution discrepancy between X and Z, with samples x_i ∈ X and z_i ∈ Z, can be defined as follows:…”
Section: The Proposed Framework
Mentioning, confidence: 99%
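The quoted statement is truncated before the estimate itself. For reference, the standard biased empirical estimate of the squared Maximum Mean Discrepancy between samples x_1, …, x_n from X and z_1, …, z_m from Z with kernel k is given below; this is the textbook form from the two-sample-test literature, not necessarily the exact variant used in the citing paper:

\widehat{\mathrm{MMD}}^2(X, Z) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} k(x_i, x_j) + \frac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m} k(z_i, z_j) - \frac{2}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} k(x_i, z_j)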
“…The performance of clustering-based methods depends on how well the clustering structure can represent the entire data structure. The other category is optimal experimental design methods [49][50][51], which try to query representative examples in a transductive manner. The major problem with experimental-design-based methods is that a large number of samples need to be accessed before the optimal decision boundary is found, while the informativeness of the queried samples is almost ignored.…”
Section: Introduction
Mentioning, confidence: 99%
“…It has been applied in transfer learning and active learning on image classification problems. Chattopadhyay et al. proposed an active learning method based on MMD that minimizes the distance between the distributions of labelled and unlabelled data [21]. Gong et al. introduced a method to select the landmark data of the source domain according to its distance to the target domain [22].…”
Section: Related Work
Mentioning, confidence: 99%