2016
DOI: 10.1016/j.knosys.2016.01.013

A relevant subspace based contextual outlier mining algorithm

Abstract: For high-dimensional and massive data sets, a relevant subspace based contextual outlier detection algorithm is proposed. Firstly, the relevant subspace, which can effectively describe the local distribution of the various data sets, is redefined by using local sparseness of attribute dimensions. Secondly, a local outlier factor calculation formula in the relevant subspace is defined with probability density of local data sets, and the formula can effectively reflect the outlier degree of data object that does…

Cited by 33 publications (10 citation statements)
References 24 publications
“…For subspace learning-based outlier methods, the key is to find the relevant outliers by sifting through different subsets of dimensions in the data in an ordered way. Generally, these methods can be divided into two categories: the sparse subspace methods [17,18,19] and the relevant subspace methods [20,21,22,23,24,25,26].…”
Section: Subspace Learning-based Methods
Citation type: mentioning
confidence: 99%
“…An alternative to use neighbor-based methods is to detect anomalies taken into account that they are grouped in a zone of the data space. Thus, the anomaly detection problem is solved as a subspace learning problem [33][34][35][36]. Although this method work well in some cases, finding the number of subspaces in which the anomalies are distributed is not trivial.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Based on the above characteristics of data with double-peaked emission lines, data preprocessing is worked out before experiments. A characteristic extraction method based on relevant subspace (RS) is used to obtain the characteristics related with double peaks [1], [36], [58]. A dataset including 345 spectra with double-peaked emission lines, which are known and identified currently, is selected from LAMOST DR5 to extract the useful characteristics [14], [15].…”
Section: Experiments Analysis, A. Data Preprocessing
Citation type: mentioning
confidence: 99%