Acoustic- and elastic-waveform inversion is an important and widely used method for reconstructing subsurface velocity images. Waveform inversion is a typical nonlinear and ill-posed inverse problem. Existing physics-driven computational methods for solving waveform inversion suffer from cycle skipping and local minima, and they are also computationally expensive. In recent years, data-driven methods have become a promising way to solve the waveform inversion problem. However, most deep learning frameworks suffer from generalization and overfitting issues. In this paper, we develop a real-time data-driven technique, which we call VelocityGAN, to accurately reconstruct subsurface velocities. VelocityGAN is built on a generative adversarial network (GAN) and trained end-to-end to learn a mapping function from raw seismic waveform data to the velocity image. Unlike other encoder-decoder-based data-driven seismic waveform inversion approaches, VelocityGAN learns regularization from data and imposes that regularization on the generator so that inversion accuracy is improved. We further develop a transfer learning strategy based on VelocityGAN to alleviate the generalization issue. A series of experiments is conducted on synthetic seismic reflection data to evaluate the effectiveness, efficiency, and generalization of VelocityGAN. We not only compare it with existing physics-driven approaches and data-driven frameworks but also conduct several transfer learning experiments. The experimental results show that VelocityGAN achieves state-of-the-art performance among the baselines and can improve the generalization results to some extent.