2022
DOI: 10.48550/arxiv.2205.04980

ALLSH: Active Learning Guided by Local Sensitivity and Hardness

Abstract: Active learning, which effectively collects informative unlabeled data for annotation, reduces the demand for labeled data. In this work, we propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function. The proposed method generates data copies through local perturbations and selects data points whose predictive likelihoods diverge the most from their copies. We further empower our acquisition function by injecting the select-worst case perturbation. Our method achieves…
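As a rough illustration of the acquisition rule described in the abstract, the sketch below scores each unlabeled example by how much the model's predictive distribution diverges between the original input and its locally perturbed copies, then selects the highest-scoring examples for annotation. This is a minimal sketch assuming a generic PyTorch classifier and a user-supplied `perturb` function (e.g. token dropout or paraphrasing for text); the function names and interfaces are hypothetical and are not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def local_sensitivity_scores(model, unlabeled_batch, perturb, num_copies=1):
    """Score each unlabeled example by the divergence between the model's
    prediction on the original input and on its locally perturbed copies."""
    model.eval()
    with torch.no_grad():
        log_p = F.log_softmax(model(unlabeled_batch), dim=-1)   # predictions on originals
        scores = torch.zeros(unlabeled_batch.size(0))
        for _ in range(num_copies):
            copies = perturb(unlabeled_batch)                    # local perturbation (assumed helper)
            log_q = F.log_softmax(model(copies), dim=-1)         # predictions on copies
            # KL(p_original || p_copy): large when the prediction is locally sensitive
            scores += F.kl_div(log_q, log_p, log_target=True, reduction="none").sum(-1)
    return scores / num_copies

def select_for_annotation(model, unlabeled_batch, perturb, budget):
    """Pick the `budget` most locally sensitive examples to send for labeling."""
    scores = local_sensitivity_scores(model, unlabeled_batch, perturb)
    return torch.topk(scores, k=budget).indices
```

The choice of divergence and perturbation here is only one plausible instantiation of "predictive likelihoods diverging from their copies"; the paper's select-worst-case variant would replace the sampled perturbation with the one that maximizes this divergence.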

Cited by 1 publication (1 citation statement)
References 48 publications
“…Rather than using a fixed MLE-trained model, we derive an objective that trains both the policy and the dynamic model toward maximizing a lower bound of true expected return (simultaneously minimizing the policy evaluation error |J(π, P * ) − J(π, P )|). Several recent works also propose to enhance model training, such as training a reverse dynamic model to encourage conservative model-based imaginations [53], learning a balanced state-action representation for the model to mitigate the distributional discrepancy [54,55], and using advanced dynamic-model architecture of the GPT-Transformer [56][57][58][59] to achieve accurate predictions [60]. These methods are orthogonal to ours and may further improve the performance.…”
Section: Related Work (citation type: mentioning)
Confidence: 99%