2020
DOI: 10.1016/j.envsoft.2020.104869

Active learning for anomaly detection in environmental data

Cited by 41 publications (22 citation statements)
References 35 publications
“…For the time series of ecosystem dynamics, we first performed an outlier analysis by excluding values higher than three times the median absolute deviation of all values in a sliding window (Leys et al 2013) of one day window size (15-minute interval = 96 data points). In addition to outlier removal, we visually inspected the data and manually removed anomalous periods from the data (<2%; for more details refer to Russo et al [2020]). After aggregating four measurement points to one per hour (from 96 to 24 data points per day), we calculated mean and coefficient of variation (hereafter CV) of the aggregated data within windows that were sized one week (7 × 24 = 168 data points).…”
Section: Discussion (mentioning)
confidence: 99%
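
The preprocessing steps described in this citation statement can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the citing authors' code: the function names (`mad_filter`, `aggregate`, `window_stats`) are hypothetical, the sliding window is assumed to be centered, the weekly windows are assumed to be non-overlapping, and the 1.4826 consistency constant follows common MAD practice (the statement does not say whether it was applied).

```python
from statistics import mean, median, stdev


def mad_filter(values, window=96, threshold=3.0, scale=1.4826):
    """Replace points that deviate from the sliding-window median by more
    than threshold * scale * MAD with None.

    window=96 corresponds to one day of 15-minute readings. The centered
    window and the scale constant are assumptions of this sketch.
    """
    half = window // 2
    cleaned = []
    for i, v in enumerate(values):
        win = values[max(0, i - half): i + half]
        med = median(win)
        mad = median([abs(x - med) for x in win])
        # Note: if mad is 0, any value that differs from the median is dropped.
        cleaned.append(None if abs(v - med) > threshold * scale * mad else v)
    return cleaned


def aggregate(values, size=4):
    """Average consecutive blocks of `size` points, ignoring removed (None)
    points. Four 15-minute readings become one hourly value."""
    out = []
    for start in range(0, len(values) - size + 1, size):
        block = [x for x in values[start:start + size] if x is not None]
        out.append(mean(block) if block else None)
    return out


def window_stats(values, size=168):
    """Return (mean, coefficient of variation) for each non-overlapping
    window of `size` points; 168 hourly values make one week."""
    stats = []
    for start in range(0, len(values) - size + 1, size):
        block = [x for x in values[start:start + size] if x is not None]
        if len(block) < 2:
            stats.append((None, None))
            continue
        m = mean(block)
        cv = stdev(block) / m if m != 0 else None
        stats.append((m, cv))
    return stats
```

Chaining the three calls, `window_stats(aggregate(mad_filter(series)))`, gives one (mean, CV) pair per week of 15-minute data. The manual removal of anomalous periods mentioned in the statement is not represented here.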
“…In assessing algorithm performance, we defer to technician labels as a benchmark. However, the quality control process is subjective (Jones et al, 2018) and data are not perfectly labeled, making reliance on technician labels as a gold standard problematic (Russo et al, 2020). In the LRO data, we identified numerous cases where it was unclear why some data points were labeled and others were not (see Appendix C), which may be due to multiple technicians and evolving protocols, among other reasons.…”
Section: Anomaly Detection Example (mentioning)
confidence: 99%
“…Another regression technique based on a previous sequence of data is Long Short-Term Memory (LSTM), a class of Artificial Neural Networks (ANNs). Though applications to environmental data anomalies to date are limited, LSTM models have been used to reconstruct time series to detect anomalies in other fields (Hundman et al, 2018; Lindemann et al, 2019; Malhotra et al, 2016; Yin et al, 2020), and other ANN model types have been used for environmental anomaly detection (Hill and Minsker, 2010; Russo et al, 2020). Other algorithms that show promise for time series regression include Prophet, a time series forecasting method developed by Facebook with focus on business applications (Taylor and Letham, 2018), and Hierarchical Temporal Memory (HTM) (Ahmad et al, 2017).…”
Section: A4 Regression Approaches (mentioning)
confidence: 99%