“…For the YOLOv3 model, we test our models on a subset of 5000 images from the COCO2014 [24] dataset, and use the mean average precision (mAP) measured at 50% intersection over union (IoU), denoted mAP@50, as our accuracy metric. Our results for YOLOv3 are compared with the best published settings of three previous scalable codecs [2,3,4], referred to as Choi2022, Harell2022, and Ozyilkan2023. For completeness, we also include two traditional codecs, VVC-intra [16] and HEVC-intra [14] (also known as BPG), and the learnable codec of [10], which we refer to as Cheng2020.…”
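As an aside, the mAP@50 criterion above counts a detection as correct when its IoU with a ground-truth box is at least 0.5. A minimal sketch of the IoU computation (illustrative only, not from the quoted paper; box format and function name are our own assumptions):

```python
# Illustrative sketch: intersection-over-union (IoU) of two axis-aligned
# boxes given as (x1, y1, x2, y2). Under mAP@50, a detection matches a
# ground-truth box when IoU >= 0.5.

def iou(box_a, box_b):
    """Return the IoU of two (x1, y1, x2, y2) boxes."""
    # Coordinates of the intersection rectangle (empty if boxes are disjoint).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes overlapping in a 5x10 strip: IoU = 50 / 150 = 1/3.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Mean average precision then averages, over classes, the area under each class's precision–recall curve built from matches at this threshold.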