2020
DOI: 10.1371/journal.pone.0238835

Analyzing the fine structure of distributions

Abstract: One aim of data mining is the identification of interesting structures in data. For better analytical results, the basic properties of an empirical distribution, such as skewness and eventual clipping, i.e. hard limits in value ranges, need to be assessed. Of particular interest is the question of whether the data originate from one process or contain subsets related to different states of the data producing process. Data visualization tools should deliver a clear picture of the univariate probability density …

Cited by 52 publications (43 citation statements)
References 46 publications
“…The Hellinger point distance measure is selected because clear multimodality is visible in the probability density distribution. Several metrics were investigated using the R package 'parallelDist' and the MD-plot function [10] in the R package 'DataVisualizations'. The detailed mathematical definitions can be found in SI F. The probability density distribution is modeled with a Gaussian mixture model and verified visually with QQplot as described in [43] with the R package 'AdaptGauss'.…”
Section: Distance Selection (mentioning)
confidence: 99%
“…In that case, the classes defined by such explanations should contain samples of different environmental states and be based on different processes. The property of relevance is qualitatively evaluated by class mirrored-density plots class (MD plots) [10]. Additionally, statistical testing of class-wise distributions of features can be performed to ensure that the classes defined by rules are tendentially contrastive and, in consequence, relevant.…”
Section: Evaluating the Relevance of Explanations (mentioning)
confidence: 99%