The purpose of this paper is to provide a systematic discussion of a generalized barycenter based on a variant of unbalanced optimal transport (UOT). This variant defines a distance between general non-negative, finitely supported measures by allowing for mass creation and destruction, modeled by a cost parameter. We refer to the resulting objects as the Kantorovich–Rubinstein (KR) barycenter and the KR distance. In particular, we detail the influence of the cost parameter on structural properties of the KR barycenter and the KR distance. For the latter, we highlight a closed-form solution on ultra-metric trees. The support of KR barycenters of finitely supported measures turns out to be finite in general, and its structure is explicitly specified by the support of the input measures. Additionally, we prove the existence of sparse KR barycenters and discuss potential computational approaches. The performance of the KR barycenter is compared to the OT barycenter on a multitude of synthetic datasets. We also consider barycenters based on the recently introduced Gaussian Hellinger–Kantorovich and Wasserstein–Fisher–Rao distances.
We derive limit distributions for empirical regularized optimal transport distances between probability distributions supported on a finite metric space and show consistency of the (naive) bootstrap. In particular, we prove that the empirical regularized transport plan itself asymptotically follows a Gaussian law. The theory includes the Boltzmann-Shannon entropy regularization and hence a limit law for the widely applied Sinkhorn divergence. Our approach is based on an application of the implicit function theorem to necessary and sufficient optimality conditions for the regularized transport problem. The asymptotic results are investigated in Monte Carlo simulations. We further discuss computational and statistical applications, e.g. confidence bands for colocalization analysis of protein interaction networks based on regularized optimal transport.
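To make the entropy-regularized transport plan mentioned above concrete, the following is a minimal sketch of the standard Sinkhorn iteration on a finite space. It is not the authors' code: the function name `sinkhorn_plan`, the parameter names, the four-point ground space, and the choice `reg=0.5` are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_plan(r, s, cost, reg=0.5, n_iter=1000, tol=1e-10):
    """Entropy-regularized transport plan between r and s (illustrative sketch).

    r, s : probability vectors on a finite space (hypothetical inputs)
    cost : matrix of pairwise costs between the support points
    reg  : regularization strength (assumed value, not from the abstract)
    """
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(r, dtype=float)
    v = np.ones_like(s, dtype=float)
    for _ in range(n_iter):
        u_prev = u
        v = s / (K.T @ u)            # scale columns to match s
        u = r / (K @ v)              # scale rows to match r
        if np.max(np.abs(u - u_prev)) < tol:
            break
    # The plan has the form diag(u) K diag(v).
    return u[:, None] * K * v[None, :]


# Toy example: four points on a line with cost |i - j| (assumed setup).
points = np.arange(4)
cost = np.abs(points[:, None] - points[None, :]).astype(float)
r = np.array([0.4, 0.3, 0.2, 0.1])
s = np.array([0.1, 0.2, 0.3, 0.4])

plan = sinkhorn_plan(r, s, cost)
transport_cost = float(np.sum(plan * cost))  # linear cost of the regularized plan
```

In the empirical setting of the abstract, `r` and `s` would be replaced by relative frequencies computed from samples, and the limit law describes how the resulting plan fluctuates around its population counterpart.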
We provide a unifying approach to central limit type theorems for empirical optimal transport (OT). In general, the limit distributions are characterized as suprema of Gaussian processes. We explicitly characterize when the limit distribution is centered normal or degenerates to a Dirac measure. Moreover, in contrast to recent contributions on distributional limit laws for empirical OT on Euclidean spaces, which require centering around the expectation, the distributional limits obtained here are centered around the population quantity, which is well-suited for statistical applications. At the heart of our theory is Kantorovich duality, which represents OT as a supremum over a function class $\mathcal{F}_c$ for an underlying sufficiently regular cost function $c$. In this regard, OT is considered as a functional defined on $\ell^\infty(\mathcal{F}_c)$, the Banach space of bounded functionals from $\mathcal{F}_c$ to $\mathbb{R}$ equipped with the uniform norm. We prove the OT functional to be Hadamard directionally differentiable and conclude distributional convergence via a functional delta method that necessitates weak convergence of an underlying empirical process in $\ell^\infty(\mathcal{F}_c)$. The latter can be handled with empirical process theory and requires $\mathcal{F}_c$ to be a Donsker class. We give sufficient conditions, depending on the dimension of the ground space, the underlying cost function and the probability measures under consideration, to guarantee the Donsker property. Overall, our approach reveals a noteworthy trade-off inherent in central limit theorems for empirical OT: Kantorovich duality requires $\mathcal{F}_c$ to be sufficiently rich, while the empirical process only converges weakly if $\mathcal{F}_c$ is not too complex.
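The duality representation described above can be written out as follows. This is a sketch in standard notation, not a formula quoted from the paper: the symbols $\mu$, $\nu$, $f^c$, the empirical measure $\mu_n$, the Gaussian process $\mathbb{G}_\mu$ and the set $S_c(\mu,\nu)$ of optimal dual solutions are assumptions introduced here, and the limit is shown only for the one-sample case under the regularity conditions the abstract alludes to.

```latex
% Kantorovich duality: OT as a supremum over the function class F_c
OT_c(\mu,\nu) \;=\; \sup_{f \in \mathcal{F}_c}
  \left( \int f \, d\mu + \int f^{c} \, d\nu \right),
\qquad
f^{c}(y) \;=\; \inf_{x} \bigl( c(x,y) - f(x) \bigr).

% Shape of the resulting limit law (one-sample case), centered at the
% population quantity and given by a supremum of a Gaussian process
\sqrt{n}\,\bigl( OT_c(\mu_n,\nu) - OT_c(\mu,\nu) \bigr)
  \;\xrightarrow{\;d\;}\;
  \sup_{f \in S_c(\mu,\nu)} \mathbb{G}_\mu(f).
```

When the optimal dual solution is unique up to an additive constant, the supremum is over essentially a single function and the limit is centered normal, which is consistent with the characterization stated in the abstract.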