Early Exit [12,118,126,161,194,244,245,249,271,282,313]; Model Selection [159,191,271,314]; Result Cache [13,39,53,92,93,96,108,112,114,123,209,268,293,319]

3.3.1 Model Compression: Model compression techniques facilitate the deployment of resource-hungry AI models on resource-constrained edge servers by reducing the complexity of the DNN. Model compression exploits the sparsity of the gradients and computations involved in training the DNN model.
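As a concrete illustration of exploiting sparsity, the following is a minimal sketch of magnitude-based weight pruning, one widely used compression technique; the function name, threshold scheme, and parameters are illustrative assumptions, not taken from the cited works:

```python
import numpy as np

def prune_weights(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`.

    Illustrative magnitude pruning: weights whose absolute value falls
    below a data-dependent threshold are set to zero, producing a sparse
    matrix that is cheaper to store and (with suitable kernels) to execute
    on a resource-constrained edge server.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold            # keep only larger weights
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for a DNN layer
pruned = prune_weights(w, sparsity=0.9)
achieved = 1.0 - np.count_nonzero(pruned) / pruned.size
print(f"achieved sparsity: {achieved:.2f}")
```

In practice such pruning is typically followed by fine-tuning to recover accuracy, and the resulting sparse weights are stored in a compressed format before deployment to the edge.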