2018
DOI: 10.1016/j.talanta.2018.06.061

Uncharted forest: A technique for exploratory data analysis

Abstract: Exploratory data analysis is crucial for developing and understanding classification models from high-dimensional datasets. We explore the utility of a new unsupervised tree ensemble called uncharted forest for visualizing class associations, sample-sample associations, class heterogeneity, and uninformative classes for provenance studies. The uncharted forest algorithm can be used to partition data using random selections of variables and metrics based on statistical spread. After each tree is grown, a tally …
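
The partitioning step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`best_split`, `grow_tree`), the parameters (`n_vars`, `max_depth`), and the choice of variance as the spread metric are all assumptions made for illustration. The abstract is cut off before it explains the tally, so the sketch stops once one tree has partitioned the samples.

```python
import random
from statistics import pvariance


def best_split(rows, var):
    """Find the cut on column `var` that leaves the least spread.

    Returns (threshold, remaining), where `remaining` is the fraction of
    the column's variance left after the split, or None when the column
    holds a single distinct value. Variance is an assumed spread metric.
    """
    column = [row[var] for row in rows]
    values = sorted(set(column))
    if len(values) < 2:
        return None
    total = pvariance(column) * len(column)
    best = None
    for threshold in values[:-1]:  # cutting at the maximum would empty one side
        left = [v for v in column if v <= threshold]
        right = [v for v in column if v > threshold]
        within = pvariance(left) * len(left) + pvariance(right) * len(right)
        remaining = within / total
        if best is None or remaining < best[1]:
            best = (threshold, remaining)
    return best


def grow_tree(rows, indices, rng, n_vars=2, max_depth=3, depth=0):
    """Recursively partition samples without using class labels.

    At each node a random subset of variables is drawn and the split that
    removes the most spread is kept. Returns the terminal nodes as lists
    of sample indices. All parameter names here are hypothetical.
    """
    if depth >= max_depth or len(indices) < 2:
        return [indices]
    subset = [rows[i] for i in indices]
    n_features = len(rows[0])
    candidates = rng.sample(range(n_features), min(n_vars, n_features))
    best = None
    for var in candidates:
        split = best_split(subset, var)
        if split is not None and (best is None or split[1] < best[2]):
            best = (var, split[0], split[1])
    if best is None:  # every candidate variable was constant in this node
        return [indices]
    var, threshold, _ = best
    left = [i for i in indices if rows[i][var] <= threshold]
    right = [i for i in indices if rows[i][var] > threshold]
    return (grow_tree(rows, left, rng, n_vars, max_depth, depth + 1)
            + grow_tree(rows, right, rng, n_vars, max_depth, depth + 1))


# Made-up data: two well-separated groups of three samples each.
rows = [(1.0, 10.0), (1.2, 11.0), (0.9, 9.5),
        (8.0, 30.0), (8.3, 31.0), (7.9, 29.0)]
leaves = grow_tree(rows, list(range(len(rows))), random.Random(0), max_depth=1)
print(leaves)
```

Scoring each candidate by the fraction of variance remaining, instead of by raw variance, keeps variables measured on different scales comparable. An ensemble would repeat `grow_tree` with different random seeds.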

Cited by 3 publications (2 citation statements)
References 24 publications
“…EDA aims to comprehend data structure and patterns, serving multiple analytical purposes, including identifying outliers, detecting data deviations, determining crucial variables, suggesting hypotheses, and uncovering hidden patterns [25]. Statistical analyses or graphic representations can be deployed to address hypotheses built with EDA [26]. Specific techniques, such as measures of frequency, central tendency, and dispersion, were applied to the dataset for quantitative data, with the calculation of frequency and proportion applied for qualitative data.…”
Section: Figure 1 Stages Of Study
Citation type: mentioning
confidence: 99%
“…With increased interactive statistical [12] and defined visualisation tools, [13] visualisation has become inevitable part of EDA. Methods for visualising temporal data [14] and algorithm-based methods, such as Self-Organising Maps (SOM) [15] and uncharted forest-based [16] visualisation techniques, are explored widely in recent years. The generated DAMADICS data set was subjected to three layers of visualisation techniques.…”
Section: Visualisation Of Raw Data
Citation type: mentioning
confidence: 99%