“…However, computational efficiency is critical in real-world scenarios, where the computation executed translates into power consumption or carbon emissions. Many works have tried to reduce the computational cost of CNNs via neural architecture search [10,16,25,54,57], knowledge distillation [20,55], dynamic routing [4,13,43,51,56], and pruning [15,18], but how to accelerate the ViT model has rarely been explored.…”