This paper is concerned with the problem of top-K
ranking from pairwise comparisons. Given a collection of n
items and a few pairwise comparisons across them, one wishes to identify the set
of K items that receive the highest ranks. To tackle this
problem, we adopt the logistic parametric model known as the Bradley-Terry-Luce
model, where each item is assigned a latent preference score, and where the
outcome of each pairwise comparison depends solely on the relative scores of the
two items involved. Recent works have made significant progress towards
characterizing the performance (e.g. the mean square error for estimating the
scores) of several classical methods, including the spectral method and the
maximum likelihood estimator (MLE). However, where they stand regarding
top-K ranking remains unsettled.
We demonstrate that under a natural random sampling model, the spectral
method alone, or the regularized MLE alone, is minimax optimal in terms of the
sample complexity (the number of paired comparisons needed to ensure
exact top-K identification) in the fixed dynamic range regime.
This is accomplished via optimal control of the entrywise error of the score
estimates. We complement our theoretical studies by numerical experiments,
confirming that both methods yield low entrywise errors for estimating the
underlying scores. Our theory is established via a novel leave-one-out trick,
which proves effective for analyzing both iterative and non-iterative
procedures. Along the way, we derive an elementary eigenvector perturbation
bound for probability transition matrices, which parallels the Davis-Kahan
sin Θ theorem for symmetric matrices. This also
allows us to close the gap between the ℓ2 error upper bound for the spectral method and
the minimax lower limit.
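
For illustration only, the setting described above can be sketched in a few lines of Python. The abstract does not spell out the construction, so everything here is an assumption: comparisons are drawn on an Erdős-Rényi graph with edge probability p, each observed pair is compared L times, and the spectral estimate is a Rank Centrality-style stationary distribution of a Markov chain built from the empirical win fractions. The function names (simulate_btl, spectral_scores, top_k) and the normalization d are hypothetical choices, not taken from the paper.

```python
import numpy as np


def simulate_btl(scores, p, L, rng):
    """Draw BTL comparisons on a random graph (assumed sampling model).

    Returns (mask, y): mask[i, j] is True if items i and j were compared,
    and y[i, j] is the fraction of the L comparisons in which j beat i.
    """
    n = len(scores)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    mask = upper | upper.T
    # Under BTL, P(j beats i) = w_j / (w_i + w_j).
    prob = scores[None, :] / (scores[:, None] + scores[None, :])
    wins = rng.binomial(L, prob) / L
    y = np.triu(wins, k=1)                      # keep one draw per pair (i < j)
    y = y + np.tril(1.0 - y.T, k=-1)            # enforce y[j, i] = 1 - y[i, j]
    return mask, np.where(mask, y, 0.0)


def spectral_scores(mask, y, n_iter=1000):
    """Estimate scores as the stationary distribution of a comparison chain."""
    n = y.shape[0]
    d = 2 * max(mask.sum(axis=1).max(), 1)      # assumed normalization, d >= max degree
    P = y / d                                   # move from i to j in proportion to j's wins
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))    # self-loops make each row sum to one
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):                     # power iteration on the left eigenvector
        pi = pi @ P
    return pi / pi.sum()


def top_k(estimates, K):
    """Return the sorted indices of the K largest estimated scores."""
    return sorted(int(i) for i in np.argsort(-estimates)[:K])
```

With scores of fixed dynamic range and a connected comparison graph, top_k(spectral_scores(mask, y), K) returns the candidate top-K set. Exact recovery hinges on every entry of the estimate being accurate, which is why an entrywise error bound, rather than an ℓ2 bound, is the relevant quantity.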