We study three notions of uncertainty quantification (calibration, confidence intervals, and prediction sets) for binary classification in the distribution-free setting, that is, without making any distributional assumptions on the data. With a focus on calibration, we establish a 'tripod' of theorems that connect these three notions for score-based classifiers. A direct implication is that distribution-free calibration is only possible, even asymptotically, using a scoring function whose level sets partition the feature space into at most countably many sets. Parametric calibration schemes such as variants of Platt scaling do not satisfy this requirement, while nonparametric schemes based on binning do. To close the loop, we derive distribution-free confidence intervals for binned probabilities for both fixed-width and uniform-mass binning. As a consequence of our 'tripod' theorems, these confidence intervals for binned probabilities lead to distribution-free calibration. We also derive extensions to settings with streaming data and covariate shift.
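As a rough illustration of the binning idea described above, the following sketch fits uniform-mass bins on one half of a calibration set and builds per-bin confidence intervals on the other half. The Hoeffding bound with a Bonferroni correction is an assumption chosen for simplicity and is not necessarily the bound derived in the paper; the function names, the synthetic data, and the sample-splitting layout are likewise hypothetical.

```python
import bisect
import math
import random


def uniform_mass_edges(scores, n_bins):
    """Interior bin edges taken as empirical quantiles, so bins hold similar counts."""
    ordered = sorted(scores)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]


def binned_intervals(scores, labels, edges, alpha=0.1):
    """Return (lower, estimate, upper) for P(Y=1 | bin) in each bin.

    Assumption: two-sided Hoeffding intervals, with alpha split evenly
    across bins (Bonferroni) so that all intervals hold simultaneously.
    """
    n_bins = len(edges) + 1
    counts = [0] * n_bins
    positives = [0] * n_bins
    for s, y in zip(scores, labels):
        b = bisect.bisect_right(edges, s)  # bin index in 0..n_bins-1
        counts[b] += 1
        positives[b] += y
    delta = alpha / n_bins
    result = []
    for n, k in zip(counts, positives):
        if n == 0:
            # Empty bin: only the trivial interval is available;
            # 0.5 is a placeholder estimate, not a measurement.
            result.append((0.0, 0.5, 1.0))
            continue
        estimate = k / n
        half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        result.append((max(0.0, estimate - half_width),
                       estimate,
                       min(1.0, estimate + half_width)))
    return result


# Synthetic, deliberately miscalibrated scores: P(Y=1 | score) = score**2.
random.seed(0)
scores = [random.random() for _ in range(2000)]
labels = [1 if random.random() < s * s else 0 for s in scores]

# Sample splitting: the bins are fixed before the interval data are seen.
edges = uniform_mass_edges(scores[:1000], n_bins=5)
intervals = binned_intervals(scores[1000:], labels[1000:], edges, alpha=0.1)
```

The recalibrated prediction for a new point would be the `estimate` of the bin its score falls into, with the interval indicating how far that estimate may be from the true binned probability.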
We show the equivalence of a two-point geometric model that emits statistically correlated normal random processes and a three-point non-equidistant geometric model that emits statistically uncorrelated normal random processes. The two-point model emitting statistically correlated signals is represented as a superposition of two models: one emits statistically uncorrelated signals, and the other emits fully correlated signals. In this way, a three-point non-equidistant geometric model that emits statistically uncorrelated signals and has one virtual emitter is obtained. Analytical relations are derived that allow one model to be synthesized from the parameters of the other. It is shown that for each two-point model emitting statistically correlated signals, an infinite set of three-point non-equidistant models can be synthesized, differing from one another in the position of one of the emitters. Recommendations are formulated for choosing the position of the virtual emitter of the three-point non-equidistant model so as to provide the widest range of control over the parameters of the model's angular noise. The results can be used to synthesize a two-point geometric model that emits statistically correlated signals and provides a specified correlation function of the angular noise.
When deployed in the real world, machine learning models inevitably encounter changes in the data distribution, and certain (but not all) distribution shifts can result in significant performance degradation. In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially, making interventions by a human expert (or model retraining) unnecessary. While several works have developed tests for distribution shifts, these typically either use non-sequential methods, or detect arbitrary shifts (benign or harmful), or both. We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate. In this work, we design simple sequential tools for testing whether the difference between source (training) and target (test) distributions leads to a significant drop in a risk function of interest, such as accuracy or calibration. Recent advances in constructing time-uniform confidence sequences allow efficient aggregation of statistical evidence accumulated during the tracking process. The designed framework is applicable in settings where (some) true labels are revealed after the prediction is performed, or when batches of labels become available in a delayed fashion. We demonstrate the efficacy of the proposed framework through an extensive empirical study on a collection of simulated and real datasets.
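The monitoring idea described above can be sketched as follows. This is a deliberately crude stand-in, not the paper's construction: it obtains a time-uniform lower bound on the target risk by applying a one-sided Hoeffding bound at each step with error budget delta / (t (t + 1)), which sums to delta over all t. The class name `RiskMonitor`, the parameter `source_risk_upper` (an upper bound on the source risk assumed to be supplied by the user), and the tolerance value are all hypothetical, and losses are assumed to lie in [0, 1] and to be i.i.d.

```python
import math


class RiskMonitor:
    """Raise an alarm once target risk provably exceeds source risk plus a tolerance.

    Assumptions: losses are in [0, 1] and i.i.d.; the bound is Hoeffding
    with a union bound over time, which is valid at all times simultaneously
    but looser than modern confidence sequences.
    """

    def __init__(self, source_risk_upper, tolerance, delta=0.05):
        self.source_risk_upper = source_risk_upper
        self.tolerance = tolerance
        self.delta = delta
        self.t = 0
        self.loss_sum = 0.0

    def lower_bound(self):
        """Time-uniform lower confidence bound on the target risk."""
        if self.t == 0:
            return 0.0
        delta_t = self.delta / (self.t * (self.t + 1))
        radius = math.sqrt(math.log(1.0 / delta_t) / (2.0 * self.t))
        return max(0.0, self.loss_sum / self.t - radius)

    def update(self, loss):
        """Record one labeled loss; return True if a harmful shift is flagged."""
        self.t += 1
        self.loss_sum += loss
        return self.lower_bound() > self.source_risk_upper + self.tolerance


# A benign stream (no errors) should never trigger the alarm.
benign = RiskMonitor(source_risk_upper=0.10, tolerance=0.05)
benign_alarms = [benign.update(0.0) for _ in range(200)]

# A clearly harmful stream (every prediction wrong) should trigger it.
harmful = RiskMonitor(source_risk_upper=0.10, tolerance=0.05)
harmful_alarms = [harmful.update(1.0) for _ in range(200)]
```

Because the bound holds at all times at once, the monitor can be checked after every new label without inflating the false alarm rate beyond `delta`.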
The problem of phase calibration of a matrix simulator is considered. The phase error at the point of reception is divided into systematic and random components. Analytical relations are obtained that make it possible to evaluate and compensate for the systematic error in calibrating the phases of the signals emitted by the matrix simulator; this error is caused by the geometric separation of the phase centers of the antenna of the device under test and the antenna of the calibration receiver. The random component of the phase error is compensated by the calibration algorithm. Analytical relations are also obtained for the compensation error that arises from imprecise knowledge of the coordinates of the emitting part of the matrix simulator and of the phase center of the measuring receiver's antenna. The magnitude of this error is determined for a typical arrangement of the antennas of the device under test, the measuring receiver, and the matrix simulator during hardware-in-the-loop simulation. A description of the laboratory test bench of the matrix simulator's developer is given. The theoretical results are confirmed experimentally on this test bench.
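The geometric origin of the systematic error can be illustrated with a minimal free-space sketch. It is not taken from the paper: it only uses the standard fact that a path of length d contributes a phase of 2πd/λ, and it assumes point-like phase centers, free-space propagation, and a sign convention in which a longer path to the calibration receiver gives a positive error. The function name and all coordinates are hypothetical.

```python
import math


def systematic_phase_error(emitter, dut_antenna, cal_antenna, wavelength):
    """Phase difference (radians) between the emitter-to-calibration-receiver
    path and the emitter-to-device-under-test path.

    Assumption: free-space propagation between point phase centers, so each
    path of length d contributes a phase of 2*pi*d / wavelength.
    """
    d_cal = math.dist(emitter, cal_antenna)
    d_dut = math.dist(emitter, dut_antenna)
    return 2.0 * math.pi * (d_cal - d_dut) / wavelength


# Hypothetical geometry in metres: the calibration antenna sits 15 mm
# farther from the emitter than the antenna of the device under test.
emitter = (0.0, 0.0, 0.0)
dut_antenna = (0.0, 0.0, 3.0)
cal_antenna = (0.0, 0.0, 3.015)
wavelength = 0.03  # 3 cm, i.e. roughly 10 GHz

error = systematic_phase_error(emitter, dut_antenna, cal_antenna, wavelength)
corrected_offset = -error  # offset to add to the measured phase to compensate
```

In this example the path difference is half a wavelength, so the uncompensated error is half a cycle, which shows why even centimetre-scale separations matter at these frequencies.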