2017
DOI: 10.48550/arxiv.1706.09516

CatBoost: unbiased boosting with categorical features

Abstract: This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of datasets. Two critical algorithmic advances introduced in CatBoost are the implementation of ordered boosting, a permutation-driven alternative to the classic algorithm, and an innovative algorithm for processing categorical features. Both techniques were created to fight a pre…
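The permutation-driven idea behind CatBoost's categorical-feature processing (ordered target statistics) can be sketched roughly as follows. This is an illustrative simplification under assumed choices of prior and smoothing, not CatBoost's exact implementation: each example is encoded using only the targets of examples that precede it in a random permutation, so an example's own target never leaks into its encoding.

```python
import random

def ordered_target_statistics(categories, targets, prior=0.5, seed=0):
    """Encode a categorical feature from 'past' examples only, in a
    random permutation (illustrative sketch, not CatBoost's exact scheme)."""
    n = len(categories)
    order = list(range(n))
    random.Random(seed).shuffle(order)  # the permutation defines "history"
    sums, counts = {}, {}
    encoded = [0.0] * n
    for idx in order:
        c = categories[idx]
        s, k = sums.get(c, 0.0), counts.get(c, 0)
        # smoothed mean target over examples seen earlier in the permutation;
        # the first occurrence of a category falls back to the prior
        encoded[idx] = (s + prior) / (k + 1)
        sums[c] = s + targets[idx]
        counts[c] = k + 1
    return encoded

values = ordered_target_statistics(["a", "b", "a", "a"], [1, 0, 1, 0])
```

Because the encoding for each row depends only on earlier rows in the permutation, the statistic is unbiased in the sense the paper targets; averaging over several permutations (as CatBoost does) reduces the variance this single-permutation sketch would have.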

Cited by 194 publications (207 citation statements). References 14 publications.
“…Different implementations of the Gradient Boosted Decision Trees method exist, e.g. XGBoost (Chen and Guestrin, 2016), LightGBM (Ke et al., 2017), CatBoost (Prokhorenkova et al., 2017). We use here LightGBM.…”
Section: Problem Settings
confidence: 99%
“…In this study, we will use the data from 2016-2018 to obtain the value of xi k for the rest of the period. Therefore, there will not be any issue with the target leakage problem (Zhang et al., 2013; Prokhorenkova et al., 2017).…”
Section: Competition-dependent Factor and Team-level Historical Records
confidence: 99%
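The target-leakage concern this citing paper raises is typically avoided with a strict temporal split: features for the prediction period are computed from historical rows only. A minimal illustration, where the field names and cutoff year are hypothetical rather than taken from the cited study:

```python
# Build a team-level feature (historical win rate) from rows up to a
# cutoff year, so targets from the prediction period never feed back
# into their own features (illustrative sketch; field names assumed).
rows = [
    {"year": 2016, "team": "A", "win": 1},
    {"year": 2017, "team": "A", "win": 0},
    {"year": 2018, "team": "A", "win": 1},
    {"year": 2019, "team": "A", "win": 1},  # prediction period: excluded
]

history = [r for r in rows if r["year"] <= 2018]

totals = {}
for r in history:
    wins, games = totals.get(r["team"], (0, 0))
    totals[r["team"]] = (wins + r["win"], games + 1)

team_feature = {t: wins / games for t, (wins, games) in totals.items()}
```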
“…We present metrics for the joint evaluation of predictive uncertainty and robustness to distributional shift. We validate our proposed metrics using the baseline Shifts Challenge Gradient Boosted Decision Trees (GBDT) models [15, 16].…”
Section: Evaluation Metrics
confidence: 99%