SUMMARY
The estimation of the Green's function between two points on the Earth's surface by the cross-correlation of seismic noise time-series has become a widely used method in seismology. In general, very long time-series (months to years) as well as massive normalization and/or data selection are necessary to obtain useful cross-correlation functions. One task of this study is to evaluate the influence of different established normalization methods on the obtained cross-correlation functions. Furthermore, we evaluate two waveform-preserving time domain normalizations as well as a new, fully automated data selection approach. The cross-correlation functions analysed in this study are obtained from 12 months of seismic noise recorded in 2004 at five seismic stations in the United States with interstation distances on a continental scale. For practical reasons, the cross-correlation functions of such long time-series are calculated by stacking the cross-correlation functions obtained from shorter time windows. We use this stacking process to implement the waveform-preserving time domain normalizations. The time window length is in general an important parameter of the cross-correlation processing, as it influences the normalization and data selection. Therefore, we evaluate the cross-correlation functions obtained with 47 different time window lengths between 1 hr and 24 hr. The time domain normalizations are intended to suppress the influence of transient signals such as earthquake waves as well as long-term (e.g. seasonal) amplitude variations. We compare the proposed waveform-preserving time domain normalizations with the established running absolute mean normalization and the one-bit normalization. We demonstrate that a waveform-preserving time domain normalization can replace a non-linear time domain normalization if a time window length similar to the duration of the typically occurring transient signals is used.
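The windowed stacking and the time domain normalizations described above can be illustrated with a minimal sketch. It is not the processing used in this study: the function names, the non-overlapping windows and in particular the per-window RMS scaling (used here as a stand-in for a waveform-preserving normalization, since it multiplies each window by a single constant) are assumptions made for illustration only.

```python
import numpy as np

def one_bit(x):
    """One-bit normalization: keep only the sign of each sample (non-linear)."""
    return np.sign(x)

def running_abs_mean(x, half_width):
    """Divide each sample by the mean absolute amplitude in a centred window."""
    n = 2 * half_width + 1
    weights = np.convolve(np.abs(x), np.ones(n) / n, mode="same")
    weights[weights == 0] = 1.0  # guard against division by zero
    return x / weights

def window_rms_norm(x):
    """Assumed waveform-preserving variant: one scale factor per time window."""
    rms = np.sqrt(np.mean(x ** 2))
    return x / rms if rms > 0 else x

def stacked_cross_correlation(a, b, window_len, normalize):
    """Average the cross-correlations of consecutive, non-overlapping windows."""
    n_windows = min(len(a), len(b)) // window_len
    stack = np.zeros(2 * window_len - 1)
    for k in range(n_windows):
        sl = slice(k * window_len, (k + 1) * window_len)
        # Zero lag sits at index window_len - 1 of the "full" output.
        stack += np.correlate(normalize(a[sl]), normalize(b[sl]), mode="full")
    return stack / n_windows
```

In this sketch, `window_len` plays the role of the time window length discussed above: a transient that dominates one window is down-weighted as a whole by `window_rms_norm`, which only works well if the window is not much longer than the transient itself.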
In addition to the time domain normalizations, spectral whitening in the frequency domain is evaluated. Spectral whitening is a powerful normalization for improving the emergence of broad-band signals in seismic noise cross-correlations. Nevertheless, we observe that spectral whitening depends strongly on the time window length. An unwanted amplification of a persistent microseism signal is observed on the continental scale with time windows shorter than 12 hr. Our approach to automated data selection is based on a statistical time-series classification and reliably excludes time windows with transient signals occurring simultaneously at both sites (e.g. earthquake waves). This data selection approach is capable of replacing a non-linear time domain normalization, but in general no improvement of the waveform symmetry or the signal-to-noise ratio of the cross-correlation functions is observed.
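For readers unfamiliar with spectral whitening, the following minimal sketch shows the basic operation on a single time window. It assumes the simplest variant, in which every Fourier amplitude is set to one over the full band and the phase is left unchanged; band limiting or spectral smoothing, as commonly applied in practice, is omitted, and the function name and the `eps` parameter are hypothetical.

```python
import numpy as np

def spectral_whiten(x, eps=1e-12):
    """Flatten the amplitude spectrum of one time window, keeping the phase."""
    spectrum = np.fft.rfft(x)
    amplitude = np.abs(spectrum)
    # eps only guards against division by zero for empty frequency bins.
    whitened = spectrum / np.maximum(amplitude, eps)
    return np.fft.irfft(whitened, n=len(x))
```

Because the whitening is applied to each time window separately, the spectrum that is flattened is the spectrum of that window, which is one way to see why the result can depend on the time window length.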