Low-cost sensors (LCSs) show great potential for enabling pervasive and continuous monitoring of crucial environmental parameters, supporting environmental preservation and informing citizens' well-being through ubiquitous air quality data. Their main drawback is that the data they produce are usually biased, even when the LCSs are calibrated by the manufacturer at production time. More accurate in-field calibration methods based on machine learning (ML) and neural networks (NNs) have been investigated in recent studies; these typically require co-locating the LCSs with reference measurement stations certified by environmental agencies. Due to seasonality effects, however, the correlation between an LCS and its reference may degrade rapidly once the LCS is moved away from the calibration site, rendering even highly accurate calibrations ineffective. In this work, we specifically target this problem by optimizing the training settings of the most popular ML and NN calibration models for LCSs when a sequential split scheme is adopted to separate the training and test sets. We then assess the degradation of the calibration over time, measured by the R² score, when the dataset is split between training and test sets at ratios other than the classical 80%-20%. The method is applied to real data gathered from an O3 sensor deployed in co-location with a certified reference station over a period of six months. Finally, we show that, in the case of Long Short-Term Memory (LSTM) NNs, using 20% of the dataset for training is a trade-off that minimizes the calibration effort while still yielding a robust and long-lasting calibration.
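To make the evaluation protocol concrete, the following is a minimal sketch of the sequential split and R²-based degradation assessment described above. The synthetic data, the 20% training fraction applied here, and the use of a generic random-forest regressor as a stand-in calibration model are illustrative assumptions; the paper's actual models (including the LSTM NNs) and its six-month O3 dataset differ.

```python
# Minimal sketch: sequential train/test split of a co-located LCS time
# series, followed by R^2 evaluation on successive monthly chunks of the
# test set to observe calibration degradation over time.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-in for ~6 months of hourly LCS readings (raw signal
# plus temperature) and a co-located reference with a slow drift.
n = 24 * 30 * 6
t = np.arange(n)
raw = 40 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, n)
temp = 15 + 10 * np.sin(2 * np.pi * t / (24 * 180)) + rng.normal(0, 1, n)
reference = 0.8 * raw + 0.5 * temp + 0.002 * t + rng.normal(0, 1, n)

X = np.column_stack([raw, temp])

# Sequential split: the first train_frac of the series (in temporal
# order) trains the calibration model; the remainder is held out.
train_frac = 0.20
split = int(n * train_frac)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], reference[:split])

# Degradation assessment: R^2 on monthly chunks of the test set,
# moving progressively further away from the calibration period.
month = 24 * 30
for i, start in enumerate(range(split, n - month + 1, month), start=1):
    sl = slice(start, start + month)
    r2 = r2_score(reference[sl], model.predict(X[sl]))
    print(f"month {i} after training: R^2 = {r2:.3f}")
```

A sequential (rather than random) split is essential here: shuffling would leak future seasonal conditions into the training set and mask exactly the degradation effect the study sets out to measure.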