We introduce a new information-theoretic formulation of quantum measurement uncertainty relations, based on the notion of relative entropy between measurement probabilities. In the case of a finite-dimensional system and for any approximate joint measurement of two target discrete observables, we define the entropic divergence as the maximal total loss of information occurring in the approximation at hand. For fixed target observables, we study the joint measurements minimizing the entropic divergence, and we prove the general properties of its minimum value. Such a minimum is our uncertainty lower bound: the total information lost by replacing the target observables with their optimal approximations, evaluated at the worst possible state. The bound also turns out to be an entropic incompatibility degree, that is, a good information-theoretic measure of incompatibility: indeed, it vanishes if and only if the target observables are compatible, it is state-independent, and it enjoys all the invariance properties which are desirable for such a measure. In this context, we point out the difference between general approximate joint measurements and sequential approximate joint measurements; to do this, we introduce a separate index for the tradeoff between the error of the first measurement and the disturbance of the second one. By exploiting the symmetry properties of the target observables, exact values, lower bounds and optimal approximations are evaluated in two different concrete examples: (1) a pair of spin-1/2 components (not necessarily orthogonal); (2) two Fourier conjugate mutually unbiased bases in prime power dimension. Finally, the entropic incompatibility degree straightforwardly generalizes to the case of many observables, still maintaining all its relevant properties; we explicitly compute it for three orthogonal spin-1/2 components.
arXiv:1608.01986v3 [math-ph] 9 Jan 2018

Trivial and sharp observables. An observable A is trivial if A = p 1 for some probability p, where 1 is the identity operator of H. In particular, we will make use of the uniform distribution u_X on X, u_X(x) = 1/|X|, and the trivial uniform observable U_X = u_X 1. An observable A is sharp if A(x) is a projection ∀x ∈ X. Note that we allow A(x) = 0 for some x, which is required when dealing with sets of observables sharing the same outcome space. Of course, for every sharp observable we have |{x : A(x) ≠ 0}| ≤ d.

Bi-observables and compatible observables. When the outcome set has the product form X × Y, we speak of bi-observables. In this case, given the POVM M ∈ M(X × Y), we can also introduce the marginal observables M[1] ∈ M(X) and M[2] ∈ M(Y), defined by

M[1](x) = Σ_{y∈Y} M(x, y),   M[2](y) = Σ_{x∈X} M(x, y).

In the same way, for p ∈ P(X × Y), we get the marginal probabilities p[1] ∈ P(X) and p[2] ∈ P(Y). Clearly, (M[i])_ρ = (M_ρ)[i]; hence there is no ambiguity in writing M_ρ[i] for both probabilities. Two observables A ∈ M(X) and B ∈ M(Y) are jointly measurable or compatible if there exists a bi-observable M ∈ M(X × Y) such that M[1] = A and M[2] = B; then, we call M a joint measurement of A and B. Two classical probabilities p...
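The definitions above can be made concrete in a small numerical sketch (not part of the paper; the names A, M, marginal and the chosen state rho are illustrative). It builds a sharp qubit observable, the trivial uniform observable, and a product bi-observable whose marginals reproduce the target, then checks the identity (M[i])_ρ = (M_ρ)[i] on a sample density matrix.

```python
import numpy as np

# Sharp qubit observable: spectral projections of sigma_z, outcome set X = {0, 1}.
P0 = np.array([[1, 0], [0, 0]], dtype=complex)
P1 = np.array([[0, 0], [0, 1]], dtype=complex)
A = {0: P0, 1: P1}

# Trivial uniform observable U_X = u_X 1, with u_X(x) = 1/|X|.
U = {x: np.eye(2, dtype=complex) / 2 for x in (0, 1)}

# A commutes with itself, so the product bi-observable M(x, y) = A(x) A(y)
# is a joint measurement of A with A (its marginals both equal A).
M = {(x, y): A[x] @ A[y] for x in (0, 1) for y in (0, 1)}

def marginal(M, i):
    """i-th marginal (i = 0 or 1) of a bi-observable, summing out the other index."""
    out = {}
    for (x, y), E in M.items():
        k = (x, y)[i]
        out[k] = out.get(k, np.zeros_like(E)) + E
    return out

M1, M2 = marginal(M, 0), marginal(M, 1)
assert all(np.allclose(M1[x], A[x]) for x in A)  # M[1] = A
assert all(np.allclose(M2[y], A[y]) for y in A)  # M[2] = A

# Probabilities in a state rho: M_rho(x, y) = tr[rho M(x, y)]. Marginalizing the
# probability or the POVM first gives the same result: (M_rho)[i] = (M[i])_rho.
rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)  # an example density matrix
p = {xy: np.trace(rho @ E).real for xy, E in M.items()}
p1 = {x: p[(x, 0)] + p[(x, 1)] for x in (0, 1)}           # (M_rho)[1]
q1 = {x: np.trace(rho @ M1[x]).real for x in (0, 1)}      # (M[1])_rho
assert all(abs(p1[x] - q1[x]) < 1e-12 for x in (0, 1))
```

For incompatible targets (e.g. projections of two non-commuting spin components), no bi-observable with both marginals exact exists, which is precisely the situation the entropic divergence quantifies.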