With the increasing complexity of recent autonomous platforms, there is a strong demand to better utilize system resources while satisfying stringent real-time requirements. Embedded virtualization is an appealing technology to meet this demand: it enables the consolidation of real-time systems with different criticality levels on a single hardware platform by enforcing temporal isolation. On multi-core platforms, however, shared hardware resources, such as caches and memory buses, weaken this isolation. In particular, the large last-level cache of recent processors can easily jeopardize the timing predictability of real-time tasks due to cache interference. While researchers in the real-time systems community have developed solutions to tackle this problem, existing cache management schemes exhibit two major limitations when used in a clustered multi-core embedded system. The first is the cache co-partitioning problem, which can lead to incorrect cache allocation and cache underutilization. The second is the cache interference caused by inter-virtual-machine (VM) communication, which prior work has not addressed because it considers only independent tasks. This paper presents a cluster-aware real-time cache allocation scheme to address these problems. The proposed scheme takes into account the cluster information of the system and finds a cache allocation that satisfies the timing and memory requirements of tasks. The scheme also maximizes slack time to meet task deadlines, which brings flexibility and resilience against unexpected events. Tasks using inter-VM communication are provided with bounded blocking time and cache isolation. We have implemented a prototype of our scheme on an Nvidia TX2 clustered multi-core platform and evaluated its effectiveness against cluster-unaware approaches.

INDEX TERMS Cache interference, clustered multi-core platforms, real-time systems, embedded virtualization, real-time hypervisor, partitioning hypervisor, real-time resource management.