The application of such a classifier is divided into two phases: the fitting-phase, during which the internal parameters (or model) of the multivariate classifier are adjusted so that it can statistically distinguish signal and background data-points; and the application-phase, during which the fitted classifier is applied to new data-points with unknown labels. The model complexity plays an important role during the fitting-phase and can be controlled by the hyper-parameters of the model. If the model is too simple (too complex), it will be under-fitted (over-fitted) and perform poorly on test data-points with unknown labels.

Stochastic gradient-boosted decision trees [8] are widely employed in high energy physics for multivariate classification and regression tasks. The implementation presented in this paper was developed for the Belle II experiment [2], which is located at the SuperKEKB collider in Tsukuba, Japan. Multivariate classification is used extensively in the Belle II Analysis Software Framework (BASF2) [16], for instance during the reconstruction of particle tracks, as part of particle identification algorithms, and to suppress background processes in physics analyses. Often a large number of classifiers must be fitted: for hyper-parameter optimization, for different background scenarios, to obtain improved estimates of the importance of individual features, or to create networks of classifiers that feed into one another. Therefore, Belle II required a default multivariate classification algorithm that is fast during fitting and application, robust enough to be trained in an automated environment, reliably usable by non-experts, and that preferably generates an interpretable model and exhibits good out-of-the-box performance.

FastBDT satisfies those requirements and is the default multivariate classification algorithm in BASF2.
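The two phases described above can be illustrated with a short sketch. It uses scikit-learn's `GradientBoostingClassifier` (mentioned later as one of the frameworks supported by BASF2) rather than FastBDT itself, and the toy "signal"/"background" data and all hyper-parameter values are chosen purely for illustration:

```python
# Illustration of the fitting-phase and the application-phase of a
# stochastic gradient-boosted decision tree classifier.
# Uses scikit-learn as a stand-in; this is NOT FastBDT's own API.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Toy "signal" and "background": two Gaussian blobs in 4 features.
signal = rng.normal(loc=+0.5, size=(1000, 4))
background = rng.normal(loc=-0.5, size=(1000, 4))
X = np.vstack([signal, background])
y = np.array([1] * 1000 + [0] * 1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Fitting-phase: the hyper-parameters (number of trees, tree depth,
# shrinkage, sub-sampling fraction) control the model complexity and
# therefore the balance between under- and over-fitting.
clf = GradientBoostingClassifier(
    n_estimators=100, max_depth=3, learning_rate=0.1, subsample=0.5)
clf.fit(X_train, y_train)

# Application-phase: the fitted model is applied to data-points whose
# labels are treated as unknown, yielding a signal probability each.
proba = clf.predict_proba(X_test)[:, 1]
accuracy = clf.score(X_test, y_test)
```

A model with `max_depth` far too large would fit the training sample almost perfectly but generalize poorly to `X_test`, which is the over-fitting failure mode described above.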
In addition, BASF2 supports other popular multivariate analysis frameworks such as TMVA [9], scikit-learn (SKLearn) [19], XGBoost [5] and TensorFlow [1].

Abstract

Stochastic gradient-boosted decision trees are widely employed for multivariate classification and regression tasks. This paper presents a speed-optimized and cache-friendly implementation for multivariate classification called FastBDT. The concepts used to optimize the execution time are discussed in detail. The key ideas include: an equal-frequency binning of the input data, which allows replacing expensive floating-point operations with integer operations while at the same time increasing the quality of the classification; and a cache-friendly linear access pattern to the input data, in contrast to the random access pattern exhibited by usual implementations. FastBDT provides interfaces to C/C++, Python and TMVA. It is extensively used in the field of high energy physics (HEP) by the Belle II experiment.
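The equal-frequency binning mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique, not FastBDT's actual C++ implementation; the function name and the bin count are chosen for illustration:

```python
# Equal-frequency binning: map floating-point feature values to small
# integer bin indices using quantile boundaries, so that subsequent
# tree-building needs only integer comparisons instead of expensive
# floating-point operations.
import numpy as np

def equal_frequency_binning(values, n_bins=16):
    """Return integer bin indices and the quantile boundaries.

    Boundaries are placed at quantiles of the data, so each bin
    receives approximately the same number of data-points.
    """
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    boundaries = np.quantile(values, quantiles)
    # searchsorted assigns each value an integer bin in [0, n_bins - 1].
    indices = np.searchsorted(boundaries, values, side="right")
    return indices, boundaries

rng = np.random.default_rng(0)
feature = rng.exponential(size=10_000)  # strongly skewed toy feature

bins, boundaries = equal_frequency_binning(feature, n_bins=8)
counts = np.bincount(bins, minlength=8)
```

Because the boundaries follow the quantiles of the data, even a strongly skewed feature distribution yields bins with nearly identical populations, which is what makes the binning robust against outliers.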