Artificial intelligence (AI) has witnessed a substantial breakthrough in a variety of Internet of Things (IoT) applications and services, spanning from recommendation systems and speech processing applications to robotics control and military surveillance. This is driven by the easier access to sensory data and the enormous scale of pervasive/ubiquitous devices that generate zettabytes (ZB) of real-time data streams. Designing accurate models using such data streams, to predict future insights and revolutionize the decision-taking process, inaugurates pervasive systems as a worthy paradigm for a better quality-oflife. The confluence of pervasive computing and artificial intelligence, Pervasive AI, expanded the role of ubiquitous IoT systems from mainly data collection to executing distributed computations with a promising alternative to centralized learning, presenting various challenges, including privacy concerns, scalability, and latency requirements. In this context, a wise cooperation and resource scheduling should be envisaged among IoT devices (e.g., smartphones, smart vehicles) and infrastructure (e.g. edge nodes, and base stations) to avoid communication and computation overheads and ensure maximum performance. In this paper, we conduct a comprehensive survey of the recent techniques and strategies developed to overcome these resource challenges in pervasive AI systems. Specifically, we first present an overview of the pervasive computing, its architecture, and its intersection with artificial intelligence. We then review the background, applications and performance metrics of AI, particularly Deep Learning (DL) and online learning, running in a ubiquitous system. Next, we provide a deep literature review of communication-efficient techniques, from both algorithmic and system perspectives, of distributed inference, training and online learning tasks across the combination of IoT devices, edge devices and cloud servers. Finally, we discuss our future vision and research challenges.