Correlation analyses of ecological time series can reveal phenomena such as interspecific interactions or an environmental factor that affects several populations. However, methodological choices in these analyses can substantially affect the results, potentially producing spurious correlations or missing true associations. In this study, we explore how different decisions affect the performance of statistical tests for correlations between pairs of time series in simulated two-species ecosystems. We show that when performing nonparametric "surrogate data" tests, both the choice of test statistic and the method of generating the null distribution can affect true positive and false positive rates. We also show how seemingly closely related methods of accounting for lagged correlation produce vastly different false positive rates. For methods that establish a null model by simulating the dynamics of one of the two species, we show that the choice of which species to simulate can influence test behavior. We further identify scenarios where the outcomes of analyses are highly sensitive to the initial conditions of an ecosystem, even under simple mathematical models. Our results underscore the importance of carefully considering and documenting the statistical choices investigated here. To make this work broadly accessible, we include visual explanations of most of the methods tested in an appendix.
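As a concrete illustration of the class of tests discussed above (a minimal sketch, not code from this study), a surrogate-data test builds a null distribution for a chosen statistic by generating many artificial series that lack the association under test. The simulated data, the permutation-based surrogate scheme, and the Pearson statistic below are all illustrative assumptions; other statistics and surrogate-generation methods are possible, and, as noted above, those choices affect error rates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data (not from the study): species B's abundance index
# partly tracks species A's, plus independent noise.
n = 200
a = rng.normal(size=n)
b = 0.6 * a + rng.normal(size=n)

def corr(x, y):
    """Pearson correlation, one possible choice of test statistic."""
    return np.corrcoef(x, y)[0, 1]

# Surrogate-data test: permuting one series destroys any association
# while preserving its marginal distribution, yielding a null
# distribution for the statistic. (Other surrogate schemes preserve
# autocorrelation as well; which scheme is used matters.)
observed = corr(a, b)
n_surrogates = 2000
null = np.array([corr(rng.permutation(a), b) for _ in range(n_surrogates)])

# Two-sided p-value with the standard +1 correction.
p = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_surrogates + 1)
print(f"observed r = {observed:.3f}, p = {p:.4f}")
```

Note that simple permutation surrogates destroy temporal structure entirely; for autocorrelated ecological series this can inflate false positive rates, which is one reason the choice of null-generation method deserves explicit documentation.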