“…[20-22] Segmentation techniques can be based on thresholding [23,24], morphological operations [25], edge detection [15,26], or superpixels [27,28], in combination with connected component labeling. Machine learning approaches, in contrast, use trained classifiers in a sliding-window framework [29-31], often applied only to independently moving image regions [32-34]. To further improve these methods, several approaches exist for spatial information fusion [15,26,31,35,36] and for the consideration of context knowledge, such as street networks or tracking statistics [18,25,32,33,37-39]. Temporal information fusion, however, is often introduced through single or multiple object tracking that is based on initial detections.…”
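As an illustration of the simplest pipeline named above, thresholding followed by connected component labeling, the following is a minimal sketch only and not the method of any cited work. The function names (`threshold`, `label_components`, `bounding_boxes`), the fixed intensity level of 128, the choice of 4-connectivity, and the toy image are all assumptions made for this example.

```python
from collections import deque


def threshold(image, level):
    """Return a binary mask that is True where a pixel value exceeds `level`."""
    return [[value > level for value in row] for row in image]


def label_components(mask):
    """Label 4-connected foreground regions with 1..count; 0 marks background."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background pixel, or already assigned to a region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of the current region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def bounding_boxes(labels, count):
    """Map each label to (min_row, min_col, max_row, max_col)."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = (r, c, r, c)
            else:
                r0, c0, r1, c1 = boxes[label]
                boxes[label] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    assert len(boxes) == count  # every labeled region yields exactly one box
    return boxes


# Hypothetical 5x5 grayscale patch with three bright blobs.
image = [
    [10, 10, 200, 210, 10],
    [10, 10, 220, 10, 10],
    [10, 10, 10, 10, 10],
    [180, 10, 10, 10, 190],
    [190, 10, 10, 10, 10],
]

mask = threshold(image, 128)
labels, count = label_components(mask)
detections = bounding_boxes(labels, count)
print(count, detections)
```

Each bounding box here plays the role of an initial detection candidate. In practice the threshold would be chosen adaptively, and the candidates would be filtered by size or shape before any tracking step.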