To address the unprecedented acceleration of climate change, cities across the United States are leading efforts to reduce greenhouse gas emissions. They are beginning to adopt coherent, aggressive, and lasting mitigation policies to control carbon emissions and strengthen climate resilience across sectors. However, evaluating the effectiveness of current climate legislation requires careful monitoring of emissions through measurable and verifiable means to inform policy decisions. As part of this effort, we developed a new method to spatially allocate aircraft-based mass balance carbon dioxide (CO2) emissions. We conducted 7 aircraft flights downwind of New York City (NYC) during the nongrowing seasons between 2018 and 2020 to quantify CO2 emissions. We used an ensemble of emission inventories and transport models to calculate the fraction of enhancements (Φ) produced by sources within the policy-relevant boundaries of the 5 NYC boroughs and then applied that fraction to the bulk emissions calculated using the mass balance approach. We derived a campaign-averaged source-apportioned mass balance CO2 emission rate of (57 ± 24) (1σ) kmol/s for NYC. We evaluated the performance of this approach against other top-down methods for NYC, including inventory scaling and inverse modeling; our mean emissions estimate differed by 6.5% from the average emission rate reported by the 2 complementary approaches. By combining mass balance and transport model approaches, we improve upon traditional mass balance experiment methods to enable quantification of emissions in complex emission environments. We also used the ensemble of emission inventories and transport models to assess the sources of variability in the final calculated emission rates.
Our findings indicate that, at the campaign level, the choice of inventory accounted for 2.0% of the variability in the emission estimates and the atmospheric transport model accounted for 3.9%. At the daily scale, on average, the transport model contributed 7.6% and the inventory contributed 14.1%. Daily flight-to-flight variability contributed a substantial share, at 42.1%. This approach addresses the difficulty of interpreting aircraft-based mass balance results in complex emission environments.
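
The source-apportionment step described above can be illustrated with a minimal sketch. It assumes that Φ is computed for each inventory and transport-model pairing as the ratio of the modeled enhancement from sources inside the city boundary to the total modeled enhancement, and that the bulk mass balance emission rate is scaled by each Φ before averaging. The function name, argument names, and example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def apportion_emissions(bulk_emission, enh_city, enh_total):
    """Scale a bulk mass balance emission rate by the modeled city fraction.

    bulk_emission: bulk emission rate from the mass balance calculation (kmol/s)
    enh_city:  modeled CO2 enhancements from sources inside the city boundary,
               one value per inventory/transport-model ensemble member
    enh_total: modeled CO2 enhancements from all sources, in the same order

    Returns (mean phi, mean apportioned emission, ensemble standard deviation).
    """
    if not enh_city or len(enh_city) != len(enh_total):
        raise ValueError("enhancement lists must be non-empty and equal length")

    # Assumed definition: phi is the city share of the total modeled enhancement.
    phis = [city / total for city, total in zip(enh_city, enh_total)]

    # Apply each ensemble member's fraction to the bulk emission rate.
    emissions = [phi * bulk_emission for phi in phis]

    # Sample standard deviation across members; zero if only one member.
    spread = stdev(emissions) if len(emissions) > 1 else 0.0
    return mean(phis), mean(emissions), spread


# Hypothetical illustrative values only (not measurements from the campaign).
phi_mean, e_city, e_spread = apportion_emissions(
    bulk_emission=100.0,
    enh_city=[3.0, 2.0],
    enh_total=[5.0, 4.0],
)
print(phi_mean, e_city, e_spread)
```

In this sketch the spread across ensemble members reflects only the choice of inventory and transport model. Flight-to-flight variability would need to be assessed separately by repeating the calculation for each flight.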