In 2014, Maddah-Ali and Niesen (MAN) showed that coded caching in single bottleneck-link broadcast networks allows serving an arbitrarily large number of cache-equipped users with a total link load (bits per unit time) that does not scale with the number of users. Since then, coded caching has generated enormous interest from the information-theoretic and (network) coding-theoretic viewpoints, as well as from the viewpoint of applications. Building on the MAN work, this paper considers a particular network topology, referred to as a cache-aided Fog Radio Access Network (Fog-RAN), which includes a Macro-cell Base Station (MBS) co-located with the content server, several cache-equipped Small-cell Base Stations (SBSs), and many users without caches. Some users are served directly by the MBS broadcast downlink, while other users are served by the SBSs. The SBSs can also exchange data in rounds of direct communication over a side channel, referred to as the "sidelink". For this novel Fog-RAN model, the fundamental tradeoff among (a) the amount of cache memory at the SBSs, (b) the load on the downlink (from the MBS to the directly served users and the SBSs), and (c) the aggregate load on the sidelink is studied under the standard worst-case demand scenario. Several existing results are recovered as special cases of this network model, and byproduct results of independent interest are given. Finally, the role of topology-aware versus topology-agnostic caching is discussed.