“…Likewise, recommender systems can leverage similarity caches [110,128]: a recommender system can reduce its operating costs and response time by answering user-generated queries with relevant content served from a cache. In case of a cache miss, an application proxy (e.g., for YouTube) running close to the helper node (e.g., at a multi-access edge computing server) can instead recommend the most closely related files that are locally cached. More recently, similarity caches have been employed extensively in machine-learning-based inference systems, which store queries together with the corresponding inference results to serve future requests; examples include prediction serving systems [117], image recognition systems [118,119,121], object classification in the cloud [122], caching the hidden-layer outputs of a neural network to accelerate computation [123], and network traffic classification tasks [145]. The cache can then respond with the result of a previous query that is very similar to the current one, reducing the computational burden and latency of running complex machine learning inference models.…”
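To make the approximate hit/miss logic concrete, the following is a minimal sketch of a similarity cache for inference serving. It assumes queries are represented as embedding vectors compared under Euclidean distance, a fixed distance threshold deciding what counts as "similar enough", and FIFO eviction; the names (`SimilarityCache`, `answer`) and the linear-scan lookup are illustrative assumptions, not the design of any specific system cited above.

```python
import numpy as np

class SimilarityCache:
    """Toy similarity cache: stores (query embedding, inference result)
    pairs and answers a new query from the closest cached entry if it
    lies within a distance threshold; otherwise reports a miss."""

    def __init__(self, threshold: float, capacity: int = 1024):
        self.threshold = threshold  # max distance for an approximate hit
        self.capacity = capacity    # evict oldest entry beyond this (FIFO)
        self.keys = []              # cached query embeddings
        self.values = []            # corresponding inference results

    def get(self, query: np.ndarray):
        """Return the result cached for the nearest stored query within
        the threshold, or None on a miss."""
        if not self.keys:
            return None
        dists = [np.linalg.norm(query - k) for k in self.keys]
        best = int(np.argmin(dists))
        return self.values[best] if dists[best] <= self.threshold else None

    def put(self, query: np.ndarray, result) -> None:
        """Insert a (query, result) pair, evicting the oldest if full."""
        if len(self.keys) >= self.capacity:
            self.keys.pop(0)
            self.values.pop(0)
        self.keys.append(query)
        self.values.append(result)


def answer(query: np.ndarray, cache: SimilarityCache, model):
    """Serve a query: an approximate hit returns a cached result and
    skips the expensive model; a miss runs inference and caches it."""
    result = cache.get(query)
    if result is None:
        result = model(query)       # costly inference only on a miss
        cache.put(query, result)
    return result
```

A deployed system would presumably replace the linear scan with an approximate nearest-neighbor index and the FIFO eviction with a similarity-aware policy, but the core behavior, trading a small accuracy loss on near-duplicate queries for lower latency and compute, is the same.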