In this paper, the problem of proactive caching is studied for cloud radio access networks (CRANs). In the studied model, the baseband units (BBUs) can predict the content request distribution and mobility pattern of each user, determine which content to cache at remote radio heads and BBUs. This problem is formulated as an optimization problem which jointly incorporates backhaul and fronthaul loads and content caching. To solve this problem, an algorithm that combines the machine learning framework of echo state networks with sublinear algorithms is proposed. Using echo state networks (ESNs), the BBUs can predict each user's content request distribution and mobility pattern while having only limited information on the network's and user's state. In order to predict each user's periodic mobility pattern with minimal complexity, the memory capacity of the corresponding ESN is derived for a periodic input. This memory capacity is shown to capture the maximum amount of user information needed for the proposed ESN model. Then, a sublinear algorithm is proposed to determine which content to cache while using limited content request distribution samples. Simulation results using real data from Youku and the Beijing University of Posts and Telecommunications show that the proposed approach yields significant gains, in terms of sum effective capacity, that reach up to 27.8% and 30.7%, respectively, compared to random caching with clustering and random caching without clustering algorithm.