Ubiquitous Internet of Things (IoT) devices have fueled numerous innovations in emerging network paradigms. Among them, IoT edge caching has emerged as a promising technique to cope with the explosive growth of network data traffic, improving Quality of Service (QoS) and saving energy. However, the intrinsic storage limitations of edge servers pose a critical challenge for IoT edge caching systems. Enabling edge servers to cooperate with each other is a widely discussed way to improve edge storage utilization. Nevertheless, cooperation also incurs additional communication overhead, ultimately making the caching system more complex. How to perform efficient cooperative caching therefore becomes a critical issue. In this paper, we propose a deep reinforcement learning-based cooperative edge caching approach that allows distributed edge servers to learn to cooperate with each other. Specifically, each edge server determines its caching actions based on its local caching state. A centralized remote server then evaluates these actions and feeds the evaluation results back to the edge servers to optimize subsequent caching actions. We show that, by designing an appropriate reward function, our approach promotes cooperation among edge servers and improves the system hit rate. On this basis, we consider a practical scenario with non-uniform data item sizes and propose a novel multi-agent actor-critic caching algorithm. Extensive simulation results demonstrate the performance improvement of our proposed solution over three other caching algorithms.

INDEX TERMS Cooperative edge caching, Internet of Things, multi-agent deep learning, actor-critic, multi-agent deep deterministic policy gradient.
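To make the centralized-critic, decentralized-actor structure named in the abstract concrete, the following is a minimal PyTorch sketch of that layout in the spirit of multi-agent deep deterministic policy gradient. All class names, dimensions (N_AGENTS, STATE_DIM, ACTION_DIM), and the per-item caching-probability encoding are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only: decentralized actors on edge servers choose
# caching actions from local state; a centralized critic on the remote
# server evaluates the joint state-action. All dimensions and names below
# are assumptions for illustration.
import torch
import torch.nn as nn

N_AGENTS = 3      # number of edge servers (assumed)
STATE_DIM = 16    # size of each server's local caching state (assumed)
ACTION_DIM = 8    # one caching decision per candidate data item (assumed)

class Actor(nn.Module):
    """Decentralized actor: maps a server's local caching state to
    per-item caching probabilities (acts without a global view)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Sigmoid(),
        )

    def forward(self, local_state):
        return self.net(local_state)

class CentralCritic(nn.Module):
    """Centralized critic on the remote server: scores the joint
    state-action of all edge servers; its feedback drives the actors'
    subsequent caching-action optimization."""
    def __init__(self):
        super().__init__()
        joint_dim = N_AGENTS * (STATE_DIM + ACTION_DIM)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, states, actions):
        # states: (batch, N_AGENTS, STATE_DIM)
        # actions: (batch, N_AGENTS, ACTION_DIM)
        x = torch.cat([states.flatten(1), actions.flatten(1)], dim=1)
        return self.net(x)

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

states = torch.randn(1, N_AGENTS, STATE_DIM)   # one joint observation
actions = torch.stack(
    [actors[i](states[:, i]) for i in range(N_AGENTS)], dim=1
)                                              # each actor sees only its own state
q_value = critic(states, actions)              # centralized evaluation
print(q_value.shape)                           # torch.Size([1, 1])
```

In training, the critic's evaluation would be fed back to update each actor, and a suitably shaped reward (e.g., one crediting system-wide rather than purely local cache hits, as the abstract suggests) is what induces cooperative behavior.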