Soil health assessments use scoring curves to quantify relationships between soil quality indicators (SQIs) and ecosystem services (ESSs). We evaluated methods for scoring curve development using three labile C pools (β-glucosidase [BG], fluorescein diacetate [FDA] hydrolysis, and permanganate oxidizable carbon [POXC]). Concepts and methods for SQI interpretation used by established frameworks were assessed, with 129 studies reporting relationships to either soil organic C (SOC) (n = 367), a common surrogate for indirect estimation of ESSs, or direct measures of crop yield (n = 88), soil respiration (n = 66), and N₂O and CH₄ emissions (n = 51). Indirect assessment of BG using SOC and site covariates resolved tillage-based differences (P < .05). Correlations between SOC and SQIs observed under different land uses suggested that use of SOC for indirect scoring would be more effective for FDA than for POXC. Direct relationships were generally positive between SQIs and yield (89%), soil respiration (89%), and N₂O and CH₄ emissions (76%), but such relationships could be nonlinear. Direct assessment revealed that both positive and negative ESS outcomes increased with labile C fraction abundance, which complicates the assignment of ESS-based SQI scores. Although direct SQI scores are relatively easy to interpret, relationships between scores and SQIs can vary 10-fold for different sites and cropping systems (upland versus rice paddy), which is much greater than treatment-based differences observed within single sites. To quantify relationships between SQIs and ESS outcomes, one must measure influential site covariates (SQI-ESS covariates) along with details about management, sampling, and analysis.