2021
DOI: 10.48550/arxiv.2107.07724
Preprint

Active learning for imbalanced data under cold start

Abstract: Labeled data is essential in modern systems that rely on Machine Learning (ML) for predictive modelling. Such systems may suffer from the cold-start problem: supervised models work well but, initially, there are no labels, which are costly or slow to obtain. This problem is even worse in imbalanced data scenarios. Online financial fraud detection is an example where labeling is: i) expensive, or ii) it suffers from long delays, if relying on victims filing complaints. The latter may not be viable if a model ha…

Citations: cited by 2 publications (4 citation statements)
References: 19 publications
“…The authors emphasized the importance of aggregating transactions for capturing correlations and found that IF and RF were the best-performing ML models with high detection rates. Other studies have also contributed to the literature by designing multi-stage AL labeling policies (Lorenz et al, 2020; Barata et al, 2021), which initially apply unsupervised AD algorithms to rank the most anomalous transactions for review before switching to supervised AL policies. Barata et al (2021) proposed an intermediate stage using ODAL as a warm-up learner, which efficiently alleviated the cold start scenario with high-class imbalance.…”
Section: Anomaly Detection With Active Learning
Citation type: mentioning
confidence: 99%
“…Other studies have also contributed to the literature by designing multi-stage AL labeling policies (Lorenz et al, 2020; Barata et al, 2021), which initially apply unsupervised AD algorithms to rank the most anomalous transactions for review before switching to supervised AL policies. Barata et al (2021) proposed an intermediate stage using ODAL as a warm-up learner, which efficiently alleviated the cold start scenario with high-class imbalance. Both studies found that switching to supervised learners improved the models' performance, with Lorenz et al (2020) achieving promising results by matching the performance of a supervised baseline using just 5% of the labels.…”
Section: Anomaly Detection With Active Learning
Citation type: mentioning
confidence: 99%
“…Assuming a cold start scenario, it is intended with this work to assess the feasibility of using Anomaly Detection (AD) algorithms [1,5] and Active Learning (AL) techniques [6] to uncover fraudulent patterns in cryptocurrency transactions [7].…”
Section: Objectives
Citation type: mentioning
confidence: 99%
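
The citation statements above describe a multi-stage labeling policy: rank unlabeled items with an unsupervised anomaly detector first, then switch to a supervised active learning policy once labels are available. The following is a minimal Python sketch of that idea under loud assumptions. The z-score detector and the centroid-margin uncertainty measure are simple stand-ins chosen to keep the example dependency-free; they are not the algorithms used in the cited papers, and the ODAL warm-up stage is omitted. All function names, parameters, and the synthetic data are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def anomaly_scores(X):
    """Unsupervised stand-in detector: Euclidean norm of per-feature z-scores."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [
        math.sqrt(sum(((v - m) / s) ** 2 for v, m, s in zip(x, mus, sds)))
        for x in X
    ]


def centroid_margin(x, pos, neg):
    """Supervised stand-in uncertainty: a small margin means x lies near the class boundary."""
    cp = [mean(c) for c in zip(*pos)]
    cn = [mean(c) for c in zip(*neg)]
    return abs(math.dist(x, cp) - math.dist(x, cn))


def multi_stage_al(X, oracle, budget, batch=5, min_pos=2):
    """Query labels in batches; switch policy once both classes have been seen.

    Returns the queried labels and the policy used for each batch.
    """
    labeled, stages = {}, []
    scores = anomaly_scores(X)  # computed once, labels are not needed
    while len(labeled) < budget:
        pool = [i for i in range(len(X)) if i not in labeled]
        if not pool:
            break
        pos = [X[i] for i, y in labeled.items() if y == 1]
        neg = [X[i] for i, y in labeled.items() if y == 0]
        if len(pos) >= min_pos and neg:
            # Stage 2 (assumed switch rule): query the most uncertain items.
            pool.sort(key=lambda i: centroid_margin(X[i], pos, neg))
            stages.append("supervised")
        else:
            # Stage 1 (cold start): query the most anomalous items.
            pool.sort(key=lambda i: -scores[i])
            stages.append("unsupervised")
        for i in pool[: min(batch, budget - len(labeled))]:
            labeled[i] = oracle(i)
    return labeled, stages


# Hypothetical imbalanced data: 200 legitimate points and 5 fraud-like outliers.
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
X += [(8 + rng.gauss(0, 0.5), 8 + rng.gauss(0, 0.5)) for _ in range(5)]
y = [0] * 200 + [1] * 5

labeled, stages = multi_stage_al(X, oracle=lambda i: y[i], budget=20)
```

The switch rule used here (at least `min_pos` positives and one negative labeled) is one possible choice for illustration; the citation statements do not say how the cited works decide when to switch.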