<p><strong>Abstract.</strong> Integration of long-term eddy covariance (EC) flux datasets over regional and global scales requires a high degree of comparability among flux data measured at different stations. This entails not only instrumentation of similar performance, appropriately deployed, but also standardized and reproducible data processing and quality control (QC) procedures. This work focuses on the latter topic and, in particular, on the development of a robust data cleaning procedure. The proposed strategy includes a set of tests aimed at detecting the presence of specific sources of systematic error in the data, as well as an outlier detection procedure aimed at identifying aberrant flux values. Results from the tests and the outlier detection are integrated in such a way as to leave a large degree of flexibility in the choice of tests and of test threshold values without loss of efficacy and, at the same time, to avoid the use of subjective criteria in the decision rule that specifies whether to retain or reject flux data of dubious quality. Test development was rooted in advanced time series analysis techniques that consider the stochastic properties of both raw, high-frequency EC data and flux time series, such as complex dynamics, high persistence and the possible presence of stochastic trends. The performance of each proposed test was evaluated by means of Monte Carlo simulations on synthetic datasets, whereas their impact on observed time series was evaluated on a selection of EC datasets distributed by the ICOS research infrastructure. Simulation results showed that the proposed tests perform better than existing alternative QC routines, with lower false positive and false negative error rates. Applying the proposed tests to real datasets led to an effective cleaning of EC flux data while retaining the maximum number of good-quality data. 
Although there is still room for improvement, in particular through the development of new QC tests, we think that the proposed data cleaning procedure can serve as a basis for a unified QC strategy for EC datasets which i) includes only completely data-driven routines and is therefore suitable for automatic and centralized data processing pipelines, ii) guarantees reproducibility of results, and iii) is flexible and scalable enough to accommodate new and additional tests, which also makes the approach suitable for other greenhouse gases.</p>