We propose in this work a new method for estimating the main mode of multivariate distributions, with application to eye-tracking calibrations. When performing eye-tracking experiments with poorly cooperative subjects, such as infants or monkeys, the calibration data generally suffer from high contamination. Outliers are typically organized in clusters, corresponding to the time intervals when subjects were not looking at the calibration points. In this type of multimodal distributions, most central tendency measures fail at estimating the principal fixation coordinates (the first mode), resulting in errors and inaccuracies when mapping the gaze to the screen coordinates. Here, we developed a new algorithm to identify the first mode of multivariate distributions, named BRIL, which rely on recursive depth-based filtering. This novel approach was tested on artificial mixtures of Gaussian and Uniform distributions, and compared to existing methods (conventional depth medians, robust estimators of location and scatter, and clustering-based approaches). We obtained outstanding performances, even for distributions containing very high proportions of outliers, both grouped in clusters and randomly distributed. Finally, we demonstrate the strength of our method in a real-world scenario using experimental data from eyetracking calibrations with Capuchin monkeys, especially for distributions where other algorithms typically lack accuracy.