State-of-the-art forest models are often complex, analytically intractable, and computationally expensive, due to the explicit representation of detailed biogeochemical and ecological processes. Different models often produce distinct results, while predictions from the same model vary with parameter values. In this project, we developed a rigorous quantitative approach for conducting model intercomparisons and assessing model performance. We applied our original methodology to compare two forest biogeochemistry models, the Perfect Plasticity Approximation with Simple Biogeochemistry (PPA-SiBGC) and Landscape Disturbance and Succession with Net Ecosystem Carbon and Nitrogen (LANDIS-II NECN). We simulated past-decade conditions at flux tower sites located within Harvard Forest, MA, USA (HF-EMS) and Jones Ecological Research Center, GA, USA (JERC-RD). We mined field data available from both sites to perform model parameterization, validation, and intercomparison. We assessed model performance using the following time-series metrics: net ecosystem exchange (NEE); aboveground net primary production; aboveground biomass, C, and N; belowground biomass, C, and N; soil respiration; and species total biomass and relative abundance. We also assessed static observations of soil organic C and N, and concluded with an assessment of general model usability, performance, and transferability. Despite substantial differences in design, both models achieved good accuracy across the range of pool metrics. While LANDIS-II NECN showed better fidelity to interannual NEE fluxes, PPA-SiBGC indicated better overall performance for both sites across the 11 temporal and two static metrics tested (HF-EMS $\overline{R^2} = 0.73, +0.07$, $\overline{RMSE} = 4.68, -9.96$; JERC-RD $\overline{R^2} = 0.73, +0.01$, $\overline{RMSE} = 2.18, -1.64$). To facilitate further testing of forest models at the two sites, we provide pre-processed datasets and original software written in the R language of statistical computing.
In addition to model intercomparisons, our approach may be employed to test modifications to forest models and their sensitivity to different parameterizations.
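As a minimal sketch of how summary scores such as $\overline{R^2}$ and $\overline{RMSE}$ can be obtained, the Python snippet below computes both statistics for each metric and averages them across metrics. It is illustrative only and is not the R software provided with this work. The function names, the definition of $R^2$ as $1 - SS_{res}/SS_{tot}$, the unweighted averaging without unit normalization, and the toy numbers are all assumptions made for the example.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def mean_scores(series_by_metric):
    """Average R^2 and RMSE over metrics.

    series_by_metric maps a metric name to an (observed, predicted) pair.
    """
    r2 = [r_squared(o, p) for o, p in series_by_metric.values()]
    err = [rmse(o, p) for o, p in series_by_metric.values()]
    return sum(r2) / len(r2), sum(err) / len(err)

# Made-up numbers for illustration; not data from HF-EMS or JERC-RD.
toy = {
    "metric_a": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # perfect fit
    "metric_b": ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),  # predicts the mean
}
mean_r2, mean_rmse = mean_scores(toy)
```

Running the same function on each model's simulated series against the same observations gives paired scores that can be compared directly, which is the kind of comparison the reported differences summarize.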