“…A noteworthy implementation of such a decomposition is the recent DNN 'Uber-Net' (Kokkinos, 2016), which solves 7 vision-related tasks (including boundary detection, surface normal estimation, saliency estimation, semantic segmentation, semantic boundary detection and human part detection) with a single multi-scale network to reduce the memory footprint. It can be assumed that such multi-task training improves convergence speed and yields better generalization to unseen data, an effect that has already been observed in other multi-task setups related to speech processing, vision and maze navigation (Bilen & Vedaldi, 2016; Caruana, 1998; Dietterich, Hild, & Bakiri, 1990, 1995; Mirowski et al., 2016).…”