The present study makes a twofold contribution on counter-gradient transport (CGT) of turbulent scalar flux. First, by examining turbulent scalar mixing in an inclined jet in cross-flow with synchronized particle image velocimetry and planar laser-induced fluorescence, we clarify the previously unexplained phenomenon of CGT, revealing the key flow structures, their spatial distribution and the modelling implications. Statistical analysis identifies two distinct CGT regions: local cross-gradient transport in the windward shear layer and non-local effects near the wall downstream of injection. These behaviours are driven by specific flow structures, namely Kelvin–Helmholtz vortices (local) and wake vortices (non-local), suggesting that the scalar flux can be decomposed into a gradient-type term representing gradient diffusion and a term representing large-eddy stirring. Second, we propose a new approach for reconstructing the turbulent mean flow and scalar fields using continuous adjoint data assimilation (DA). By rectifying model-form errors through an anisotropic correction under observational constraints, our DA model minimizes the discrepancies between experimental measurements and numerical predictions. As expected, the introduced forcing term effectively identifies regions where traditional models fall short, particularly along the jet centreline and in the near-wall region, thereby enhancing the accuracy of the mean scalar field. These enhancements occur not only within the observation region but also in unseen regions, underscoring the reliability and practicality of the present DA approach for reproducing mean flow behaviours from limited data. These findings lay a solid foundation for adjoint-based, model-consistent, data-driven methods, which show promise for accurately predicting complex flow scenarios such as film cooling.