However, when a lightweight model is derived from a compressed version of a complex model, training techniques designed for compressed models, e.g., quantization-aware training [157], can be incorporated into the retraining phase to optimize continuous learning. Moreover, multiple models can share the same backbone to reduce memory consumption [9]. In our case, spatial aggregation in aggregated model training allows multiple video streams to share the same backbone parameters, so only one copy of the model backbone needs to be maintained in GPU memory on each edge server, further reducing memory consumption, as sketched below.
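To make the backbone-sharing idea concrete, the following is a minimal PyTorch-style sketch, not the paper's actual implementation: `SharedBackbone`, `StreamHead`, and the stream names are hypothetical, and the quantization-aware retraining hint in the comments assumes PyTorch's eager-mode QAT API.

```python
import torch
import torch.nn as nn

# Sketch (assumed architecture): one backbone shared across all video
# streams on an edge server, plus a lightweight per-stream head. Only
# one copy of the backbone parameters resides in GPU memory; each head
# is small, so per-stream memory overhead stays low.

class SharedBackbone(nn.Module):
    """Feature extractor whose parameters are shared by every stream."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)

class StreamHead(nn.Module):
    """Small per-stream classifier head; one instance per video stream."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.fc(self.pool(feats).flatten(1))

backbone = SharedBackbone()  # single copy kept in GPU memory
heads = {s: StreamHead(64, num_classes=10) for s in ("cam0", "cam1", "cam2")}

# If the backbone came from a compressed model, quantization-aware
# retraining could be enabled here (assumed eager-mode PyTorch QAT API):
#   backbone.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
#   torch.ao.quantization.prepare_qat(backbone.train(), inplace=True)

frames = torch.randn(4, 3, 224, 224)      # a batch of frames from "cam1"
logits = heads["cam1"](backbone(frames))  # shared features, per-stream head
```

Because gradients from every stream's head flow into the same backbone parameters, retraining on one server updates a single set of shared weights rather than one backbone per stream, which is what keeps the memory footprint roughly constant as the number of streams grows.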