Low-cost air quality sensors are a promising supplement to current reference methods for air quality monitoring, but they can suffer from issues that affect their measurement quality. Interference from environmental conditions such as temperature and humidity, cross-sensitivities with other gases and a low signal-to-noise ratio make them difficult to use in air quality monitoring without a significant investment of time in calibrating and correcting their output. Many studies have approached these problems using a variety of techniques to correct for these biases. Some use physical methods that remove the variability in environmental conditions, whereas most adopt software corrections. However, these approaches are often not standardised: they vary in study duration, measurement frequency, averaging period, average concentration of the target pollutant and the biases that are corrected. Some go further and include features with no direct connection to the measurement, such as the level of nearby traffic, which converts the initial measurement into a modelled value. Although overall trends in performance can be derived by aggregating the results from multiple studies, they do not always match observations from individual studies, a phenomenon observed across many academic fields and known as “Simpson’s Paradox”. The preference for performance metrics that use the square of the error, such as root mean squared error (RMSE) and r², over ones that use the absolute error, such as mean absolute error (MAE), makes comparing results between models and studies difficult. Ultimately, comparisons between studies are either difficult or unwise depending on the metrics used, and this literature review recommends that efforts be made to standardise the reporting of calibration and correction studies. By using metrics that do not square the error (e.g., MAE), models can be more easily compared within and between studies.
Reporting not only the raw error but also the error normalised by several factors (including the reference mean and the reference absolute deviation) can minimise the variability induced by environmental factors such as proximity to pollution sources.
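As an illustration of the metrics discussed above, the following minimal Python sketch computes MAE and RMSE for paired sensor and reference measurements, along with MAE normalised by the reference mean and by the reference absolute deviation. The function name `error_metrics`, the dictionary keys and the sample values are hypothetical. The sketch also assumes that "reference absolute deviation" means the mean absolute deviation of the reference measurements about their mean, which is one possible reading rather than a definition taken from this review.

```python
from math import sqrt
from statistics import fmean


def error_metrics(sensor, reference):
    """Return raw and normalised error metrics for paired measurements.

    Hypothetical helper: names and normalisation choices are assumptions.
    """
    if len(sensor) != len(reference) or len(reference) == 0:
        raise ValueError("sensor and reference must be non-empty and equal in length")

    errors = [s - r for s, r in zip(sensor, reference)]

    # Absolute-error metric: every error contributes in proportion to its size.
    mae = fmean([abs(e) for e in errors])
    # Squared-error metric: large errors are weighted more heavily.
    rmse = sqrt(fmean([e * e for e in errors]))

    ref_mean = fmean(reference)
    # Assumed definition: mean absolute deviation of the reference about its mean.
    ref_mad = fmean([abs(r - ref_mean) for r in reference])
    if ref_mean == 0 or ref_mad == 0:
        raise ValueError("reference mean and absolute deviation must be non-zero")

    return {
        "mae": mae,
        "rmse": rmse,
        "mae_over_ref_mean": mae / ref_mean,
        "mae_over_ref_mad": mae / ref_mad,
    }


# Made-up example values, in arbitrary concentration units.
reference = [10.0, 20.0, 30.0, 40.0]
sensor = [12.0, 18.0, 33.0, 44.0]
metrics = error_metrics(sensor, reference)
print(metrics)
```

For any set of errors, RMSE is at least as large as MAE, and the gap widens as a few large errors come to dominate. This is one reason why squared-error metrics from studies with different error distributions are hard to compare directly.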