We consider bootstrap-based testing for threshold effects in non-linear threshold autoregressive (TAR) models. It is well known that classical tests based on asymptotic theory tend to be oversized when the sample size is small or even moderate, or when the estimated parameters indicate non-stationarity, as is often the case in the analysis of financial or climate data. To address this issue, we propose a supremum Lagrange Multiplier test statistic (sLMb) for the null hypothesis of a linear autoregressive (AR) model against the alternative of a TAR model. We apply a recursive bootstrap to the sLMb statistic and establish its validity. This result is new and requires proving non-standard results for bootstrap analysis in time series models, including a uniform bootstrap law of large numbers and a bootstrap functional central limit theorem. These results also provide a general theoretical framework that can be adapted to other settings, such as regime-switching processes with exogenous threshold variables or testing for structural breaks. The Monte Carlo evidence shows that the bootstrap test has correct empirical size even in small samples, with no loss of empirical power relative to the asymptotic test. Moreover, its performance is unaffected when the order of the autoregression is selected by information criteria. Finally, we analyse a panel of short time series to assess the effect of warming on population dynamics.
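To make the procedure concrete, the following is a minimal sketch of a recursive residual bootstrap for a supremum LM test of AR against TAR. It is an illustration under stated assumptions, not the implementation used in the paper. The threshold variable is assumed to be the lag y_{t-d}, the supremum is taken over a grid of its empirical quantiles, and the residuals are resampled i.i.d. with replacement. The function names (`lagmat`, `sup_lm`, `bootstrap_pvalue`) and the defaults for the trimming, the grid size and the number of bootstrap replications are hypothetical choices.

```python
import numpy as np


def lagmat(y, p):
    """Return (target, X) where X has columns [1, y_{t-1}, ..., y_{t-p}]."""
    n = len(y)
    cols = [np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)]
    return y[p:], np.column_stack(cols)


def sup_lm(y, p=1, d=1, trim=0.25, n_grid=50):
    """Supremum LM statistic for H0: AR(p) against a two-regime TAR(p).

    Assumption: the threshold variable is y_{t-d} with 1 <= d <= p, and the
    threshold ranges over a grid of its quantiles in [trim, 1 - trim].
    """
    if not 1 <= d <= p:
        raise ValueError("this sketch requires 1 <= d <= p")
    target, X = lagmat(y, p)
    # Fit the AR(p) model under the null by OLS.
    phi, *_ = np.linalg.lstsq(X, target, rcond=None)
    e = target - X @ phi
    sigma2 = e @ e / len(e)
    XtX_inv = np.linalg.inv(X.T @ X)
    q = X[:, d]  # column d holds y_{t-d} (column 0 is the intercept)
    grid = np.quantile(q, np.linspace(trim, 1 - trim, n_grid))
    best = 0.0
    for r in grid:
        Z = X * (q <= r)[:, None]  # regressors active in the lower regime
        ZtZ = Z.T @ Z              # equals Z'X because the indicator is 0/1
        V = ZtZ - ZtZ @ XtX_inv @ ZtZ
        s = Z.T @ e                # score with respect to the regime shift
        best = max(best, float(s @ np.linalg.solve(V, s) / sigma2))
    return best, phi, e


def bootstrap_pvalue(y, p=1, d=1, n_boot=199, seed=0):
    """Recursive bootstrap p-value for the sup-LM statistic (sketch)."""
    rng = np.random.default_rng(seed)
    stat, phi, e = sup_lm(y, p, d)
    e_c = e - e.mean()  # centred residuals from the null model
    n = len(y)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        e_star = rng.choice(e_c, size=n - p, replace=True)
        y_star = np.empty(n)
        y_star[:p] = y[:p]  # initial values taken from the observed series
        for t in range(p, n):
            lags = y_star[t - p:t][::-1]  # y*_{t-1}, ..., y*_{t-p}
            y_star[t] = phi[0] + phi[1:] @ lags + e_star[t - p]
        boot[b] = sup_lm(y_star, p, d)[0]
    return stat, float(np.mean(boot >= stat))
```

The bootstrap series is generated recursively from the AR model estimated under the null, so each bootstrap statistic is computed on data that satisfy the null by construction. The p-value is the proportion of bootstrap statistics at least as large as the observed one.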