“…More recently, supervised learning-based approaches [25], particularly those based on Convolutional Neural Networks (CNNs), have shown enormous potential for tackling tasks that require dense, per-pixel prediction, such as semantic segmentation [27,28], instance segmentation [29] or crowd counting via density map estimation [30]. Blur segmentation can also be viewed as one such dense prediction task, and several works have already explored this approach, either for predicting both blur types [31,1,32] or defocus-only blur [33,34,35,36]. Nevertheless, the performance gain obtained by these fully supervised, CNN-based approaches trained end-to-end is relatively modest when compared to gains in other fields.…”