2017
DOI: 10.1007/978-3-319-71249-9_2

Efficient Top Rank Optimization with Gradient Boosting for Supervised Anomaly Detection

Abstract: In this paper we address the anomaly detection problem in a supervised setting where positive examples might be very sparse. We tackle this task with a learning-to-rank strategy by optimizing a differentiable smoothed surrogate of the so-called Average Precision (AP). Despite its non-convexity, we show how to use it efficiently in a stochastic gradient boosting framework. We show that optimizing AP is much better suited to ranking the top alerts than state-of-the-art measures. We demonstrate on anomaly detection…
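As a concrete illustration of the idea in the abstract, the sketch below smooths AP by replacing the hard ranking indicator with a sigmoid of score differences, which makes the measure differentiable in the scores. This is a minimal, generic construction: the sharpness parameter `alpha` and the exact form of the surrogate are assumptions for illustration, not the paper's precise formulation.

```python
import numpy as np

def smoothed_average_precision(scores, labels, alpha=10.0):
    """Differentiable surrogate of Average Precision (illustrative sketch).

    The hard indicator 1[s_j > s_i] that defines the rank of example i is
    replaced by a sigmoid of the score difference, so the measure becomes
    smooth in the scores and amenable to gradient-based optimization.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = np.where(labels == 1)[0]

    def sig(x):
        return 1.0 / (1.0 + np.exp(-alpha * x))

    ap = 0.0
    for i in pos:
        diff = scores - scores[i]
        # soft rank of i: 1 + soft count of examples scored above i
        # (sig(0) = 0.5 removes the self term)
        soft_rank = 1.0 + sig(diff).sum() - 0.5
        # soft count of positives ranked at or above i (including i itself)
        soft_pos_above = 1.0 + sig(diff[pos]).sum() - 0.5
        ap += soft_pos_above / soft_rank  # soft precision at the rank of i
    return ap / max(len(pos), 1)
```

As `alpha` grows, the sigmoid approaches the hard indicator and the surrogate approaches the true (non-smooth) AP.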


Cited by 39 publications (29 citation statements)
References 20 publications
“…A more elaborate solution aims at designing differentiable versions of the previous non-smooth measures and optimizing them, e.g. as done by gradient boosting in Fréry et al. (2017) with a smooth surrogate of the Mean-AP.…”
Section: Introduction (mentioning)
confidence: 99%
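To make concrete how a differentiable ranking surrogate is plugged into gradient boosting, here is a hedged sketch using XGBoost's custom-objective hook: each boosting round, the next tree is fitted against the gradient (and Hessian) of the surrogate with respect to the current scores. For brevity it uses a pairwise logistic ranking loss as a simpler stand-in for the smoothed Mean-AP surrogate of Fréry et al. (2017); the dataset and parameters are synthetic placeholders.

```python
import numpy as np
import xgboost as xgb

def pairwise_rank_objective(preds, dtrain):
    """Custom objective: pairwise logistic ranking loss (stand-in surrogate).

    Returns the gradient and Hessian of
        L = sum_{i in P, j in N} log(1 + exp(-(s_i - s_j)))
    with respect to the current scores `preds`.
    """
    y = dtrain.get_label()
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    grad = np.zeros_like(preds)
    hess = np.zeros_like(preds)
    for i in pos:
        margin = preds[i] - preds[neg]        # s_i - s_j for every negative j
        sig = 1.0 / (1.0 + np.exp(margin))    # sigma(-(s_i - s_j))
        grad[i] += -sig.sum()
        grad[neg] += sig
        curv = sig * (1.0 - sig)
        hess[i] += curv.sum()
        hess[neg] += curv
    return grad, hess

# Synthetic, imbalanced toy data (roughly 5% positives).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > np.quantile(X[:, 0], 0.95)).astype(float)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                    num_boost_round=50, obj=pairwise_rank_objective)
```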
“…In practice, one would not search over this many different parameters simultaneously using grid search, but pick only the ones deemed most important. In this study we have used Randomized Parameter Optimization, the randomized search CV method provided by the scikit-learn [21] library. Hyperparameter tuning is an intensive optimization problem that can take several hours.…”
Section: Results (mentioning)
confidence: 99%
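The statement above refers to scikit-learn's randomized search; a minimal sketch of that workflow is shown below. The estimator, the parameter distributions, and the budget `n_iter=25` are illustrative assumptions, not the settings of the cited study.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Illustrative search space over a gradient boosting model.
param_distributions = {
    "n_estimators": randint(100, 1000),
    "learning_rate": uniform(0.01, 0.3),
    "max_depth": randint(2, 8),
    "subsample": uniform(0.5, 0.5),
}

# Toy imbalanced classification data.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=25,                     # sample 25 configurations instead of a full grid
    scoring="average_precision",   # ranking-oriented metric for the imbalanced task
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Sampling a fixed number of configurations keeps the tuning budget bounded, which is why randomized search is preferred when an exhaustive grid would take many hours.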
“…GB and RF differ in how the trees are assembled, namely the order in which they are grown and the way their results are combined. Gradient boosting has shown [21] great performance on real-life datasets, especially in ranking tasks, due to two major characteristics.…”
Section: Gradient Boosting Regressor (mentioning)
confidence: 99%
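A small side-by-side sketch of the contrast described above, assuming scikit-learn's implementations: the random forest grows its trees independently on bootstrap samples and averages them, while gradient boosting grows trees sequentially, each one fitted to the residuals (negative gradient) of the current ensemble. The dataset and hyperparameters are arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Random forest: independent trees, predictions averaged.
rf = RandomForestRegressor(n_estimators=200, random_state=0)

# Gradient boosting: sequential trees, each correcting the ensemble's
# residual errors, combined additively with a shrinkage factor.
gb = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, random_state=0)

for name, model in [("random forest", rf), ("gradient boosting", gb)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```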
“…The gradient boosted regression technique suited our needs for a variety of reasons: it is able to capture the non-linear relationships which underlie atmospheric chemistry (Gardner and Dorling, 2000); the decision-tree-based machine learning technique is more interpretable than neural-net-based models (Kingsford and Salzberg, 2008); it has a relatively quick training time, allowing efficient cross-validation for tuning of hyperparameters; and it is highly scalable, meaning we are able to test on small subsets of the data before increasing to much longer training runs (Torlay et al., 2017). For the work described here we use the XGBoost (Chen and Guestrin, 2016; Frery et al., 2017) algorithm.…”
Section: Developing the Bias Predictor (mentioning)
confidence: 99%
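A hedged sketch of the kind of workflow described above, using the XGBoost regressor on synthetic data. The features, target, and hyperparameters are placeholders rather than those of the cited bias-prediction study.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Hypothetical predictors and "bias" target; the real study's inputs
# (e.g. meteorological and chemical fields) are not reproduced here.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=5000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBRegressor(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=5,
    subsample=0.8,
    random_state=0,
)
model.fit(X_train, y_train)
print("held-out R^2:", model.score(X_test, y_test))
```

Training on a small subset first, as the quoted statement suggests, only requires slicing `X_train` and `y_train` before scaling up to the full run.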