“…In particular, for each phase, we perform a local BOP with two steps to tune the parameters of the CIM model: 1) a temporary model is trained with the compressed exemplars as input; and 2) a validation loss on the uncompressed new data is computed and its gradients are backpropagated to optimize the parameters of CIM. To evaluate CIM, we conduct extensive experiments by plugging it into three recent CIL methods (LUCIR [18], DER [48], and FOSTER [44]) on three high-resolution benchmarks (Food-101 [3], ImageNet-100 [18], and ImageNet-1000 [10]). We find that using the compressed exemplars produced by CIM brings consistent and significant improvements, e.g., 4.2% and 4.8% higher accuracy than the SOTA method FOSTER [44] in the 5-phase and 10-phase settings of ImageNet-1000, respectively, under a total memory budget of 5k exemplars.…”
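The two-step procedure described above amounts to a differentiable inner/outer optimization loop. The following is a minimal sketch of one such local BOP round, assuming a MAML-style single inner update on a linear temporary classifier; the `CIM` mask module, the learning rates, and the data tensors are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of the two-step local bilevel optimization (BOP):
# the outer loop tunes CIM so that a model trained on compressed exemplars
# performs well on uncompressed new data. Assumptions: one inner gradient
# step, a linear temporary classifier, and a toy elementwise-mask "CIM".
import torch
import torch.nn as nn
import torch.nn.functional as F

class CIM(nn.Module):
    """Placeholder compression module: a learnable soft mask over inputs.
    (The paper's CIM is more elaborate; this only mimics its role.)"""
    def __init__(self, dim):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return x * torch.sigmoid(self.mask_logits)  # "compressed" exemplars

dim, n_cls = 32, 10
cim = CIM(dim)
w = torch.zeros(n_cls, dim, requires_grad=True)      # temporary linear model
outer_opt = torch.optim.SGD(cim.parameters(), lr=1e-2)

# Toy data standing in for stored exemplars and new-phase data.
exemplars = torch.randn(64, dim)
ex_labels = torch.randint(0, n_cls, (64,))
new_x = torch.randn(64, dim)
new_y = torch.randint(0, n_cls, (64,))

for _ in range(100):
    # Step 1: one inner update of the temporary model on COMPRESSED
    # exemplars, keeping the graph (create_graph=True) so the outer
    # gradient can flow back through the update into CIM's parameters.
    inner_loss = F.cross_entropy(cim(exemplars) @ w.t(), ex_labels)
    g, = torch.autograd.grad(inner_loss, w, create_graph=True)
    w_adapted = w - 0.1 * g

    # Step 2: validation loss on UNCOMPRESSED new data, backpropagated
    # to optimize CIM. Gradients also reach w but only CIM is stepped.
    val_loss = F.cross_entropy(new_x @ w_adapted.t(), new_y)
    outer_opt.zero_grad()
    val_loss.backward()
    outer_opt.step()
```

The key design point the sketch illustrates is that the inner update must remain part of the autograd graph: without `create_graph=True`, the validation loss would carry no gradient signal back to the compression parameters, and the outer step could not tune CIM.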