“…After automatically detecting all elements, the DEV dataset was manually classified into 12 classes (Figure 2) according to the USVs' spectro-temporal features (Hanson et al, 2012; Musolf et al, 2015; Nicolakis et al, 2020; Scattoni et al, 2008; Zala et al, 2020) (Table 2 in Supplementary materials). These classes are based on frequency changes (Zala et al, 2020) (> 5 kHz increase, "up"; > 5 kHz decrease, "d"), on the number of components (corresponding to breaks in the frequency track; "c2" with 2 and "c3" with 3 components), on changes of frequency direction (≥ 2 changes, "c") or shape (u-shape, "u"; inverted u-shape, "ui"), on frequency modulation (< 5 kHz, "f"), on duration (5-10 ms, "s"; < 5 ms, "us"), and on the presence of harmonic elements ("h"). It is worth noting that there are 2 more USV classes: USVs with 4 ("c4") and 5 ("c5") components.…”
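The classification described above was performed manually. Purely as an illustration of how the stated thresholds relate to one another, the rules can be written as a small decision function. This is a sketch under several assumptions that are not in the text: the feature names in `UsvFeatures`, the helper `classify_usv`, the order in which the rules are tested, the reading of a single direction change as a "u" or "ui" shape, and the "unclassified" fallback are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsvFeatures:
    """Hypothetical per-element measurements (names are assumptions)."""
    duration_ms: float
    n_components: int       # segments separated by breaks in the frequency track
    start_khz: float
    end_khz: float
    min_khz: float
    max_khz: float
    direction_changes: int  # reversals of frequency direction
    first_move: str = ""    # "down" or "up"; only used when there is one reversal
    has_harmonics: bool = False


def classify_usv(f: UsvFeatures) -> str:
    """Map measurements to a class label using the thresholds quoted in the text.

    The precedence of the rules is an assumption; the source does not state it.
    """
    if f.has_harmonics:
        return "h"
    if f.n_components >= 2:
        # "c2" and "c3", plus the additional "c4" and "c5" classes
        return f"c{f.n_components}" if f.n_components <= 5 else "unclassified"
    if f.duration_ms < 5:
        return "us"  # < 5 ms
    if f.duration_ms <= 10:
        return "s"   # 5-10 ms
    if f.direction_changes >= 2:
        return "c"   # >= 2 changes of frequency direction
    if f.max_khz - f.min_khz < 5:
        return "f"   # frequency modulation < 5 kHz
    if f.direction_changes == 1:
        # Assumption: down-then-up is the u-shape, up-then-down the inverted u-shape
        if f.first_move == "down":
            return "u"
        if f.first_move == "up":
            return "ui"
        return "unclassified"
    delta = f.end_khz - f.start_khz
    if delta > 5:
        return "up"  # > 5 kHz increase
    if delta < -5:
        return "d"   # > 5 kHz decrease
    return "unclassified"
```

For example, a 30 ms single-component element rising from 60 to 70 kHz without reversals would be labelled "up" by this sketch, whereas the same element lasting 4 ms would be labelled "us" because duration is tested first here.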