[1] Traditional inversion techniques applied to the problem of characterizing the thermal and compositional structure of the upper mantle are not well suited to deal with the nonlinearity of the problem, the trade-off between temperature and compositional effects on wave velocities, the nonuniqueness of the compositional space, and the dissimilar sensitivities of physical parameters to temperature and composition. Probabilistic inversions, on the other hand, offer a powerful formalism to cope with all these difficulties, while allowing for an adequate treatment of the intrinsic uncertainties associated with both data and physical theories. This paper presents a detailed analysis of the two most important elements controlling the outputs of probabilistic (Bayesian) inversions for temperature and composition of the Earth's mantle, namely the a priori information on model parameters, ρ(m), and the likelihood function, L(m). The former is mainly controlled by our current understanding of lithosphere and mantle composition, while the latter conveys information on the observed data, their uncertainties, and the physical theories used to relate model parameters to observed data.

[2] The benefits of combining specific geophysical datasets (Rayleigh and Love dispersion curves, body wave tomography, magnetotelluric, geothermal, petrological, gravity, elevation, and geoid), and their effects on L(m), are demonstrated by analyzing their individual and combined sensitivities to composition and temperature as well as their observational uncertainties. The dependence of bulk density, electrical conductivity, and seismic velocities on major-element composition is systematically explored using Monte Carlo simulations. We show that the dominant source of uncertainty in the identification of compositional anomalies within the lithosphere is the intrinsic nonuniqueness in compositional space.
A general strategy for defining ρ(m) is proposed based on statistical analyses of a large database of natural mantle samples collected from different tectonic settings (xenoliths, abyssal peridotites, ophiolite samples, etc.). This strategy relaxes more typical and restrictive assumptions such as the use of local/limited xenolith data or compositional regionalizations based on age-composition relations. We demonstrate that the combination of our ρ(m) with an L(m) that exploits the differential sensitivities of specific geophysical observables provides a general and robust inference platform to address the thermochemical structure of the lithosphere and sublithospheric upper mantle. An accompanying paper deals with the integration of these two functions into a general 3-D multiobservable Bayesian inversion method and its computational implementation.
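
As a purely illustrative aside, the way a prior and a multi-dataset likelihood combine in a Bayesian inversion can be sketched on a one-dimensional grid. The sketch below is not the method of this paper or its companion: the single model parameter (a temperature-like scalar), the Gaussian prior, the two linear "forward operators", the data values, and the uncertainties are all hypothetical, and the datasets are assumed independent so that their likelihoods multiply. It only shows that the posterior is proportional to the prior times L(m), and that adding a second observable with a different sensitivity narrows the posterior.

```python
import numpy as np

def gaussian_log_likelihood(d_obs, d_pred, sigma):
    """Log of an unnormalized Gaussian likelihood for one dataset.

    d_pred has shape (n_models, n_data); returns shape (n_models,).
    """
    r = (np.asarray(d_obs) - np.asarray(d_pred)) / sigma
    return -0.5 * np.sum(r ** 2, axis=-1)

def posterior_on_grid(log_prior, log_likes):
    """Normalize prior(m) * prod_i L_i(m) over a discrete grid of models."""
    log_post = log_prior + sum(log_likes)
    log_post = log_post - log_post.max()  # avoid underflow before exp
    post = np.exp(log_post)
    return post / post.sum()

def grid_mean_std(m, post):
    """Posterior mean and standard deviation on the grid."""
    mean = np.sum(m * post)
    std = np.sqrt(np.sum((m - mean) ** 2 * post))
    return mean, std

# Hypothetical scalar model parameter (assumption: a temperature in deg C).
m = np.linspace(1000.0, 1600.0, 601)

# Hypothetical Gaussian prior (mean 1300, standard deviation 150).
log_prior = -0.5 * ((m - 1300.0) / 150.0) ** 2

# Two toy linear forward operators with different sensitivities to m
# (assumptions: a velocity-like and a density-like observable).
d1_pred = (4.60 - 4.0e-4 * (m - 1300.0))[:, None]
d2_pred = (3.30 - 1.0e-4 * (m - 1300.0))[:, None]

# Hypothetical observations and uncertainties.
ll1 = gaussian_log_likelihood([4.58], d1_pred, 0.02)
ll2 = gaussian_log_likelihood([3.295], d2_pred, 0.01)

post_single = posterior_on_grid(log_prior, [ll1])
post_joint = posterior_on_grid(log_prior, [ll1, ll2])

mean_single, std_single = grid_mean_std(m, post_single)
mean_joint, std_joint = grid_mean_std(m, post_joint)
print(f"single dataset: mean={mean_single:.1f}, std={std_single:.1f}")
print(f"joint datasets: mean={mean_joint:.1f}, std={std_joint:.1f}")
```

In this toy setting the joint posterior is narrower than the single-dataset one because the second likelihood adds independent information. In a realistic problem the forward operators are nonlinear and the model space is high-dimensional, so a grid evaluation like this would be replaced by a sampling scheme.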