Content caching is regarded by both academia and industry as an efficient way to relieve the backhaul bottleneck expected in future heterogeneous cellular networks. Most caching studies are based on content popularity alone and overlook the impact of content size. In this context, this work studies content caching in an environment where cellular users are equipped with cache memories. In particular, we formulate content caching as an optimization problem whose objective is to minimize the average download latency of popular videos through self-caching and device-to-device (D2D) caching and, consequently, to increase the network throughput. To solve this problem in real-time scenarios, we introduce a low-complexity utility-based algorithm that accounts for the size and popularity of the requested contents as well as the density of the end users. Finally, we provide extensive simulation results that validate our analysis and show that the proposed scheme outperforms existing solutions.