Millimeter-wave vehicular networks incur enormous beam-training overhead to enable narrow-beam communications. This paper proposes a learning and adaptation framework in which the dynamics of the communication beams are learned and then exploited to design adaptive beam-training with low overhead: on a long-timescale, a deep recurrent variational autoencoder (DR-VAE) uses noisy beamtraining observations to learn a probabilistic model of beam dynamics; on a short-timescale, an adaptive beam-training procedure is formulated as a partially observable (PO-) Markov decision process (MDP) and optimized via point-based value iteration (PBVI) by leveraging beam-training feedback and a probabilistic prediction of the strongest beam pair provided by the DR-VAE. In turn, beam-training observations are used to refine the DR-VAE via stochastic gradient ascent in a continuous process of learning and adaptation. The proposed DR-VAE mobility learning framework learns accurate beam dynamics: it reduces the Kullback-Leibler divergence between the ground truth and the learned beam dynamics model by 86% over the Baum-Welch algorithm and by 92% over a naive mobility learningapproach that neglects feedback errors. The proposed dual-timescale approach yields a negligible loss of spectral efficiency compared to a genie-aided scheme operating under error-free feedback and foreknown mobility model. Finally, a low-complexity policy is proposed by reducing the POMDP to an error-robust MDP. It is shown that the PBVI-and error-robust MDP-based policies improve the spectral efficiency by 85% and 67%, respectively, over a policy that scans exhaustively over the dominant beam pairs, and by 16% and 7%, respectively, over a state-of-the-art POMDP policy.
I. INTRODUCTIONMillimeter-wave (mm-wave) has emerged as the most promising technology to enable multigigabit communication in vehicular communications, thanks to large bandwidth availability [2].By operating at carrier frequencies ranging from 30 GHz up to hundreds of GHz, mm-wave communications suffer from high isotropic path loss, overcome by using large antenna arrays