2015
DOI: 10.1504/ijdats.2015.071365
Outlier preservation by dimensionality reduction techniques

Abstract: Sensors are increasingly part of our daily lives: motion detection, lighting control, and energy consumption all rely on sensors. Combining this information into, for instance, simple and comprehensive graphs can be quite challenging. Dimensionality reduction is often used to address this problem, by decreasing the number of variables in the data and looking for shorter representations. However, dimensionality reduction is often aimed at normal daily data, and applying it to events deviating from this daily da…

Cited by 9 publications (9 citation statements)
References 44 publications
“…The reason for this is that e.g. normalization [16,86], dimensionality reduction [166], log transformations [167] and data type conversions [70] have all been shown to have significant impact on the presence and detection of anomalies. To be sure, transformations are allowed, but the typology then either assumes the newly derived dataset as the starting point for typification or remains agnostic as to any transformations performed as part of the AD algorithm.…”
Section: Overview of Anomaly Types and Subtypes
confidence: 99%
“…Differences between the input sequence and the reconstructed output could be highlighted for seqseq, although it would not explain the underlying model. For lstm-ae, we could learn and plot a low dimensional numerical representation based on the internal representation of the network, but dimensionality reduction methods will often produce an output biased towards the average sample of the dataset [34] and must be selected with care. This is the reason why the reconstruction error is used with seqseq to identify anomalies.…”
Section: Interpretability
confidence: 99%
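The reconstruction-error idea in the statement above can be sketched as follows. This is a minimal illustration, not the cited paper's code: a PCA-style linear reconstruction stands in for the seqseq/lstm-ae models, and all data and names are made up for the example. Samples that the low-dimensional model reconstructs poorly are scored as anomalous.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" data used to fit the low-dimensional model.
X_train = rng.normal(size=(200, 5))
mu = X_train.mean(axis=0)

# PCA basis: top-2 right singular vectors of the centred training data.
_, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
W = Vt[:2].T                  # (5, 2) projection basis

# Test batch with one injected anomaly at index 0.
X_test = rng.normal(size=(10, 5))
X_test[0] += 10.0

# Reconstruct from the 2-D representation and score by reconstruction error.
X_hat = (X_test - mu) @ W @ W.T + mu
errors = np.linalg.norm(X_test - X_hat, axis=1)
print(int(np.argmax(errors)))  # index of the most anomalous sample
```

Because the basis is fit on normal data only, the injected anomaly retains a large residual outside the learned subspace; this is also where the bias toward the average sample mentioned in the quote comes from, since whatever the subspace cannot express is discarded.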
“…It can therefore be concluded that the different heuristics used to boost the time performance of SECODA do not have an adverse effect on its functional ability to detect true anomalies. Table III presents the final algorithm's metrics for the optimal threshold according to the Youden index and Matthews Correlation Coefficient [46,53]. Due to the imbalanced distribution (i.e., few anomalies and many normal cases), several metrics are naturally high.…”
Section: Real-life Dataset with Ground-truth
confidence: 99%
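The threshold-selection step described above can be illustrated with a small sketch. This is not SECODA or the paper's evaluation code; the scores and labels below are toy values chosen to mimic an imbalanced dataset (few anomalies, many normal cases). The Youden index is sensitivity + specificity − 1, and MCC is computed from the full confusion matrix.

```python
import math

# Toy anomaly scores (higher = more anomalous) with imbalanced ground truth:
# 2 anomalies, 8 normal cases.
scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 0.1, 0.35, 0.9, 0.8]
labels = [0,   0,   0,    0,   0,    0,   0,   0,    1,   1]

def confusion(threshold):
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < threshold and y == 0 for s, y in zip(scores, labels))
    return tp, fp, fn, tn

def youden(threshold):
    tp, fp, fn, tn = confusion(threshold)
    return tp / (tp + fn) + tn / (tn + fp) - 1  # sensitivity + specificity - 1

def mcc(threshold):
    tp, fp, fn, tn = confusion(threshold)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Sweep candidate thresholds; each criterion picks the cut that best
# separates the anomalies from the normal cases.
best_j = max(sorted(set(scores)), key=youden)
best_m = max(sorted(set(scores)), key=mcc)
print(best_j, best_m)  # both pick 0.8, the cut isolating the two anomalies
```

Note that on imbalanced data, plain accuracy would be high even for a trivial threshold that flags nothing, which is why rank-free metrics like the Youden index and MCC are used here instead.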