“…Multi-layer feature utilization has been shown to be an effective way to exploit the information contained in different layers of a model, improving the representation and generalization capabilities of computer vision [23,30,42,44,50,59,81,86,92], natural language processing [1,3,11,12,26,53,64,69,76,79], and multi-modal models [13,49].…”