Recent years have witnessed many practical applications of supervised deep learning in seismic processing. However, weak generalization has prevented its widespread use for coherent noise attenuation on large-scale prestack datasets. This is particularly true for strong near-surface scattered noise in land seismic data. To alleviate this problem, we combine deep learning with an offset-vector tile (OVT) partitioning method to suppress strong scattered noise. After OVT partitioning, the seismic data are uniformly sampled in space, which offers a favorable foundation for network learning. Specifically, the probability distribution of the reflections is more stationary than that of the noise, making it easier for the network to learn the reflections. Accordingly, we train the network with a direct signal learning strategy rather than the commonly used residual learning strategy. To construct high-quality training labels, we adopt the 3D continuous wavelet transform (3D CWT), which can exploit the 3D spatial correlation in OVT gathers. A network trained on these labels produces results similar to those of 3D CWT but is far more efficient. To further improve denoising performance, we propose a training sample construction approach that uses middle-offset OVT volumes with varying azimuths, exploiting the relatively high signal-to-noise ratio of mid-offset data. A field data experiment demonstrates that the proposed method also generalizes well. Although only six middle-offset gathers are used for training, the trained network effectively and efficiently processes 1260 OVTs.
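
The difference between the direct signal learning strategy and the residual learning strategy can be illustrated with a minimal NumPy sketch. It shows only how the training target and the final signal estimate differ between the two strategies; it is not the network or the pipeline used in this work. The function names `make_training_pair` and `recover_signal`, the patch shape, and the synthetic arrays are hypothetical and chosen purely for illustration.

```python
import numpy as np


def make_training_pair(noisy, label_signal, strategy="direct"):
    """Return (network_input, target) for one training patch.

    noisy        : recorded patch (reflections + scattered noise)
    label_signal : label reflections, e.g. produced by a 3D CWT filter
    strategy     : "direct"   -> the target is the signal itself
                   "residual" -> the target is the noise (noisy - signal)
    """
    noisy = np.asarray(noisy, dtype=np.float32)
    label_signal = np.asarray(label_signal, dtype=np.float32)
    if noisy.shape != label_signal.shape:
        raise ValueError("noisy and label_signal must have the same shape")
    if strategy == "direct":
        target = label_signal
    elif strategy == "residual":
        target = noisy - label_signal
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return noisy, target


def recover_signal(noisy, prediction, strategy="direct"):
    """Turn a network prediction into a signal estimate."""
    if strategy == "direct":
        return prediction          # the network outputs the signal
    if strategy == "residual":
        return noisy - prediction  # the network outputs the noise
    raise ValueError(f"unknown strategy: {strategy!r}")


# Synthetic stand-ins for a small (time, x, y) patch of one OVT volume.
rng = np.random.default_rng(0)
signal = rng.standard_normal((8, 4, 4)).astype(np.float32)
noise = 3.0 * rng.standard_normal((8, 4, 4)).astype(np.float32)
noisy = signal + noise

# With a perfect prediction, both strategies give the same signal estimate;
# they differ in which quantity the network has to learn.
for strategy in ("direct", "residual"):
    x, y = make_training_pair(noisy, signal, strategy)
    estimate = recover_signal(x, y, strategy)
    print(strategy, float(np.max(np.abs(estimate - signal))))
```

In the direct strategy the network regresses the reflections, which the text above argues have the more stationary distribution after OVT partitioning. In the residual strategy it regresses the scattered noise and the signal is obtained by subtraction.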