High-quality labelled datasets are a cornerstone of deep learning models for land use classification. In agriculture, the high cost of data collection, the errors inherent in data mapping efforts, the lack of local knowledge, and the spatial variability of the data hinder the development of accurate and spatially transferable deep learning models. In this paper, we investigate the use of Isolation Forest (IF), an anomaly detection algorithm, to reduce noise in a large-scale, low-resolution alternative ground-truth dataset used to train deep learning models for land use. We use a modest-size, high-resolution, high-fidelity, manually collected ground-truth dataset to calibrate the IF parameters and to evaluate our approach, highlighting the relatively low cost of the methodology. Our data-centric methodology demonstrates the efficacy of deep learning methods coupled with IF for creating mid-resolution land use models and map products for agriculture from an alternative ground-truth dataset. Moreover, we compare our deep learning approach with a traditional algorithm used in remote sensing and evaluate the spatial transferability of the resulting models. Finally, we reflect on the lessons learnt and outline future work.