A number of recently published papers have focused on the problem of testing for a unit root when the driving shocks may be unconditionally heteroskedastic. These papers have, however, assumed that the lag length in the unit root test regression is a deterministic function of the sample size rather than data-determined, although the latter is standard empirical practice. In this paper we investigate the finite sample impact of unconditional heteroskedasticity on conventional data-dependent methods of lag selection in augmented Dickey-Fuller type unit root test regressions, and propose new lag selection criteria which allow for the presence of heteroskedasticity in the shocks. We show that standard lag selection methods tend to over-fit the lag order under heteroskedasticity, which results in significant power losses in the (wild bootstrap implementation of the) augmented Dickey-Fuller tests under the alternative. The new lag selection criteria we propose are shown to avoid this problem, yet deliver unit root tests with finite sample size and power properties almost identical to those of the corresponding tests based on conventional lag selection methods when the shocks are homoskedastic.