This paper uses an unusually large dataset to study scatter in site-effect estimation, focusing on how the events that increase uncertainty can be removed from the dataset. Four hundred seventy-three weak-motion earthquake records from the surface and bedrock of a 178-m-deep borehole in Aegion, Gulf of Corinth, Greece, are used to evaluate spectral ratios. A simple statistical tool, variance reduction (VR), is first used to identify the two groups of events that lie closest to and farthest from the average, which is considered here as the initial best estimate of the site response. The scatter in the original dataset is found to be due to the group of events with the smallest VR. These events can be removed from the dataset in order to compute a more reliable site response. However, VR is not normally used to choose records for site-effect studies, and it cannot be applied to the small datasets that are usually available. The signal-to-noise ratio (SNR) is normally used to this end, so we investigate whether SNR can achieve results similar to those of VR. SNR is estimated using different definitions. Data selection based on SNR is then compared with that based on VR in order to define an SNR-based criterion that discriminates against events that, according to VR, increase scatter. We find that, in this case, defining the SNR of a surface record as its mean value over a frequency range around the resonant peak (here, 0.5-1.5 Hz) and applying a cutoff value of 5 excludes most events for which VR is small. The same process is applied to the downhole station, where we obtain similar results with a cutoff value of 3.
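The selection criterion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`mean_snr_in_band`, `variance_reduction`, `select_events`) are hypothetical, and the toy spectra in the usage note are invented for illustration only. The abstract does not define VR, so the sketch assumes a common form, VR = 1 - Σ(r - m)² / Σm², where r is an event's spectral ratio and m is the average ratio. It also assumes that events with SNR equal to the cutoff are kept.

```python
from statistics import fmean


def mean_snr_in_band(freqs, signal_amp, noise_amp, f_lo=0.5, f_hi=1.5):
    """Mean signal-to-noise amplitude ratio over the band [f_lo, f_hi] in Hz.

    The default band follows the 0.5-1.5 Hz range quoted in the text.
    """
    ratios = [
        s / n
        for f, s, n in zip(freqs, signal_amp, noise_amp)
        if f_lo <= f <= f_hi
    ]
    if not ratios:
        raise ValueError("no frequency samples inside the band")
    return fmean(ratios)


def variance_reduction(ratio, mean_ratio):
    """Assumed VR definition: 1 - sum((r - m)^2) / sum(m^2).

    VR equals 1 when the event's spectral ratio matches the average exactly
    and decreases as the event departs from it.
    """
    misfit = sum((r - m) ** 2 for r, m in zip(ratio, mean_ratio))
    norm = sum(m ** 2 for m in mean_ratio)
    return 1.0 - misfit / norm


def select_events(snr_values, cutoff=5.0):
    """Indices of events whose band-averaged SNR reaches the cutoff.

    Keeping events with SNR exactly equal to the cutoff is an assumption.
    """
    return [i for i, snr in enumerate(snr_values) if snr >= cutoff]
```

As a usage example with invented amplitudes, take `freqs = [0.25, 0.5, 1.0, 1.5, 2.0]` and a flat noise spectrum of 2 at every frequency. A record with signal amplitudes `[2, 12, 20, 10, 4]` has in-band ratios of 6, 10, and 5, so its mean SNR passes the surface cutoff of 5. A record with `[2, 4, 6, 4, 2]` has in-band ratios of 2, 3, and 2 and is rejected. Passing `cutoff=3.0` gives the downhole criterion quoted in the text.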