Reply to general comments from Referee # 2

This paper proposes an approach to monitoring flood level trends using a DCNN. The topic is very interesting. However, in my opinion, it could be difficult for modellers, decision-makers and city planners to use the Static Observer Flooding Index (SOFI) directly. The authors should clearly explain what the direct or specific application scenarios of SOFI are. If SOFI or the visible area of flooding could be converted into a water depth value, or even class information on water depth, it would make this approach more
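For readers unfamiliar with the index, SOFI measures the fraction of a static camera view that is segmented as water. A minimal sketch of that computation, assuming a binary water mask as input (illustrative only, not the paper's implementation):

```python
import numpy as np

def sofi(water_mask: np.ndarray) -> float:
    # SOFI: fraction of image pixels segmented as water
    # (0 = no visible water, 1 = fully flooded view).
    return float(water_mask.mean())

# Toy 4x4 mask with the bottom row flooded -> SOFI = 4/16 = 0.25
mask = np.zeros((4, 4), dtype=np.uint8)
mask[3, :] = 1
print(sofi(mask))  # 0.25
```

Because SOFI is a unitless area fraction, converting it to a water depth would require additional site-specific information, which is the crux of the referee's concern.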
HESSD
Page 4, Figure 1. There should be a "surveillance images" box between the "camera" box and the "deep conv. network" box.

Thank you for this input. We had several versions of Figure 1, some of which contained the proposed box, but we left it out of the final version.

Changes: We will reintroduce a "surveillance video" box between the "camera" box and the "deep conv. neural network" box.
2.4 Comment # 2.4

Page 4, Line 18. Why was U-Net selected for water segmentation?

U-Net is a very well-known DCNN architecture for semantic segmentation; the original publication has nearly 5000 citations on Google Scholar. It is well suited to the flood segmentation problem because of its relatively compact size compared with more recent state-of-the-art architectures (such as Mask R-CNN). The smaller size makes it both easier to train with small datasets (which we have) and faster to run, which is useful for flood monitoring.

Changes: We will include these reasons in the manuscript.
2.5 Comment # 2.5

Thank you for this question. For the augmented strategy, we use the same images as for the basic strategy. As for the basic strategy, each training image is fed into the network up to 200 times during training (fewer if training completes faster). For the augmented strategy, however, each image is first randomly transformed before being fed into the DCNN.

Changes: We will add information about the augmented strategy in Section 2.1.3. Furthermore, we will provide more details about the augmentation transformations applied.
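The augmented strategy described above can be sketched as follows. This is an illustrative example only; the specific transformations used in the paper may differ (e.g. rotations, crops, or other photometric changes):

```python
import random
import numpy as np

def augment(image: np.ndarray, rng: random.Random) -> np.ndarray:
    # Randomly transform a training image before it is fed to the
    # DCNN. Hypothetical transformations for illustration: a random
    # horizontal flip and a random brightness scaling.
    if rng.random() < 0.5:
        image = np.fliplr(image)               # horizontal flip
    scale = rng.uniform(0.8, 1.2)              # brightness jitter
    return np.clip(image * scale, 0.0, 1.0)    # keep values in [0, 1]

# Each pass over the data, the same source image yields a slightly
# different network input, which is the point of augmentation.
rng = random.Random(42)
img = np.full((2, 2), 0.5)                     # dummy greyscale frame
out = augment(img, rng)
```

Because the transform is re-sampled on every pass, feeding each image up to 200 times yields up to 200 distinct variants rather than 200 identical copies.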
2.6 Comment # 2.6

Page 7, Lines 5-6. It is not accurate enough. How many seconds or minutes does each model training take?

Changes: At the line indicated, instead of a range we will indicate the approximate average training time in minutes for the basic and augmented strategies separately.

2.7 Comment # 2.7

Page 8, Table 2. The total frames or minutes and the resolutions of the surveillance images should be given. How the quality of the surveillance footage is defined should be explained.