Emerging video services are associated with stringent quality-of-service (QoS) and high data-rate requirements. Moreover, the presence of data-rate-hungry mobile users in future networks necessitates sophisticated design strategies. The deployment of unmanned aerial vehicle access point (UAP)-assisted networks (UANs) has been proposed to ensure high data rates for mobile users. In addition, UAPs can be equipped with energy-efficient caches to facilitate video delivery with stringent QoS. However, the mobility of users and UAPs may cause temporal variations in the QoS experienced by users. This paper conducts an extensive performance evaluation of a UAN by studying the effects of user behavior, the mobility of users and UAPs, and temporal variation in video popularity on the QoS. The QoS is measured in terms of the delay experienced by the users. To that end, a time-dependent queueing model and its associated fluid approximation models are derived, which are shown to be reasonably accurate in an appropriate asymptotic regime. A detailed analysis of these models reveals that low delay, i.e., high QoS, can be ensured in UANs. Finally, a reinforcement-learning (RL) approach based on these models is utilized to minimize the number of deployed UAPs and the playout buffer size while guaranteeing a certain QoS.