“…Then, the instance segments ('thing') and semantic segments ('stuff') [12] are fused by merging modules [52-54, 63, 70, 90, 92] to generate panoptic segmentation. Other proxy-based methods typically start with semantic segments [11,13,16] and group 'thing' pixels into instance segments with various proxy tasks, such as instance center regression [19,42,56,67,80,84,91], Watershed transform [4,8,82], Hough-voting [5,8,51], or pixel affinity [8,29,43,61,77]. DetectoRS [71] achieved the state-ofthe-art in this category with recursive feature pyramid and switchable atrous convolution.…”