Data-driven approaches to identifying geophysical signals have proven beneficial in high-dimensional environments where model-driven methods fall short. GNSS offers a source of unsaturated ground motion observations, which are the data currency of ground motion forecasting and of rapid seismic hazard assessment and alerting. However, these GNSS-sourced signals are superposed onto hardware-, location-, and time-dependent noise signatures influenced by the Earth’s atmosphere, low-cost or spaceborne oscillators, and complex radio frequency environments. Eschewing heuristic or physics-based models for a data-driven approach in this context is a step forward in autonomous signal discrimination. The performance of a data-driven approach, though, depends upon a substantial number of representative samples with accurate classifications, and more complex algorithm architectures aimed at deeper scientific insights compound this need. The existing catalogs of high-rate (≥1 Hz) GNSS ground motions are relatively limited. In this work, we model and evaluate the probabilistic noise of GNSS velocity measurements over a hemispheric network. We generate stochastic noise time series to augment low-noise strong-motion signals transferred from an existing inertial catalog and recorded within 70 kilometers of strong events (MW ≥ 5.0). We leverage known signal and noise information to assess feature extraction strategies and quantify augmentation benefits. We find that a classifier model trained on this expanded pseudo-synthetic catalog generalizes better than a model trained solely on a real GNSS velocity catalog, and that this augmentation strategy offers a framework for future enhanced data-driven approaches.
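
The augmentation step, adding a stochastic noise realization to a low-noise ground motion record, can be illustrated with a minimal sketch. It assumes the noise is described by a one-sided relative spectral shape sampled at the real-FFT frequencies, and it shapes white Gaussian noise in the frequency domain, which is a generic technique and not necessarily the noise model used in this work. The function names `synth_noise` and `augment`, the 5 Hz sampling rate, the spectral shape, and the decaying sinusoid standing in for a transferred velocity record are all hypothetical.

```python
import numpy as np

def synth_noise(shape_psd, n, rng):
    """Draw one noise realization of length n with spectrum shaped by shape_psd.

    shape_psd: relative power at the n // 2 + 1 rfft frequencies
    (an assumed input, not calibrated to physical units).
    """
    white = rng.standard_normal(n)           # white Gaussian noise
    spec = np.fft.rfft(white)                # move to the frequency domain
    shaped = spec * np.sqrt(shape_psd)       # scale amplitudes by sqrt(power)
    return np.fft.irfft(shaped, n=n)         # back to a real time series

def augment(signal, shape_psd, rng):
    """Superpose one stochastic noise realization onto a low-noise signal."""
    return signal + synth_noise(shape_psd, signal.size, rng)

# Illustrative setup: all values below are assumptions for the sketch.
fs = 5.0                                     # sampling rate in Hz
n = 300                                      # 60 s of samples
t = np.arange(n) / fs
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

# Assumed spectral shape with more power at low frequencies.
shape_psd = 1e-4 / (freqs + 0.1)

# Stand-in for a transferred low-noise velocity record.
signal = 0.05 * np.exp(-t / 10.0) * np.sin(2.0 * np.pi * 0.5 * t)

rng = np.random.default_rng(0)
# Several independent noise draws turn one record into several samples.
augmented = np.stack([augment(signal, shape_psd, rng) for _ in range(4)])
```

Each row of `augmented` holds the same underlying signal with a different noise realization, so the signal and noise components of every training sample remain known.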