Background: Chemical surveillance of surface waters is crucial for identifying potential threats to the health of freshwater ecosystems. Pollutant concentrations usually vary strongly over the course of the year and often yield non-normally distributed data sets. The European Water Framework Directive therefore recommends measuring priority substances, for example, at least 12 times a year to achieve an acceptable level of accuracy in estimating the true mean annual loads. In Europe, however, priority substances are often measured much less frequently. In this context, the aim of the present study was to analyze how sample size, temporal variability and skewness of the data sets influence the accuracy of the mean annual load estimation and the assessment of annual average environmental quality standards. For this purpose, sample size simulations were carried out using weekly composite samples of benzo(a)pyrene, 4-tert-octylphenol, fluoranthene and di(2-ethylhexyl) phthalate, selected as representatives of priority substances.
Results: The sample size simulations showed two general patterns: the accuracy of the mean annual load estimation increased with increasing sample size, and the effects of skewness and temporal variability were more apparent at smaller sample sizes. In right-skewed data sets, small sample sizes led, on average, to a systematic underestimation of the true mean annual load, whilst in a few cases they led to an overestimation. Furthermore, at small sample sizes a considerable proportion of the simulated means failed to detect exceedances of the annual average environmental quality standard. Although the study was carried out on priority substances, the results may be transferable to other pollutants.
Conclusions: The results of the present study indicate that the use of small sample sizes is likely to result in an underestimation of the true mean annual pollutant loads in chemical surveillance and scientific research, potentially jeopardizing the validity of results. It is therefore recommended to avoid small sample sizes when determining mean annual pollutant loads. Furthermore, priority substances should be sampled at least 12 times a year, in accordance with the European Water Framework Directive guidelines, to improve the assessment of the threat posed by pollutants to freshwater ecosystems in Europe.
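
The kind of sample size simulation summarized above can be illustrated with a minimal sketch. It is not the study's actual procedure, which the abstract does not specify. It assumes that n of the 52 weekly values are drawn at random without replacement and that the mean of each draw is compared with the mean of all 52 values. The function name `simulate_sample_sizes`, its parameters, the threshold `eqs` and the synthetic log-normal series are all hypothetical and serve only as an illustration.

```python
import numpy as np

def simulate_sample_sizes(weekly, sizes, n_draws=10_000, eqs=None, seed=0):
    """Resample a weekly series at several sample sizes (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    weekly = np.asarray(weekly, dtype=float)
    true_mean = weekly.mean()  # "true" annual mean: mean of all weekly values
    results = {}
    for n in sizes:
        # Shuffle each row independently; the first n columns of a row
        # are then one random draw of n weeks without replacement.
        shuffled = rng.permuted(np.tile(weekly, (n_draws, 1)), axis=1)
        means = shuffled[:, :n].mean(axis=1)
        lo, hi = np.percentile(means, [2.5, 97.5])
        summary = {
            # Median simulated mean relative to the true mean (< 1: typical underestimate)
            "median_ratio": float(np.median(means) / true_mean),
            # Share of simulated means that fall below the true mean
            "share_below_true_mean": float(np.mean(means < true_mean)),
            # Width of the central 95% range of simulated means, relative to the true mean
            "spread_95": float((hi - lo) / true_mean),
        }
        if eqs is not None and true_mean > eqs:
            # Share of simulated means that miss a true exceedance of the
            # (hypothetical) annual average quality standard `eqs`
            summary["share_missed_exceedance"] = float(np.mean(means <= eqs))
        results[n] = summary
    return results

if __name__ == "__main__":
    # Synthetic, right-skewed weekly series; NOT data from the study
    demo = np.random.default_rng(42).lognormal(mean=0.0, sigma=1.0, size=52)
    for n, summary in simulate_sample_sizes(demo, sizes=[4, 12, 26]).items():
        print(n, summary)
```

For a right-skewed series, such a sketch would be expected to give a median ratio below 1 and a wider spread at small n, in line with the patterns reported in the Results. The exact numbers depend entirely on the assumed input series.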