2021
DOI: 10.1007/s10044-021-00977-x

A hybrid reciprocal model of PCA and K-means with an innovative approach of considering sub-datasets for the improvement of K-means initialization and step-by-step labeling to create clusters with high interpretability

Cited by 13 publications (9 citation statements)
References 60 publications
“…PCA and k-means clustering were used in [37] to identify malware cases. The dataset should undergo feature engineering and preparation to identify distinguishing features for different malware instances.…”
Section: Making Clusters Using K-means and PCA
mentioning
confidence: 99%
“…., Ck), each characterized by m attributes, into k different clusters. In order to facilitate clustering, Equation (7) reduces the overall linear length [37] among instances of malware X j and their center points µ i .…”
Section: Making Clusters Using K-means and PCA
mentioning
confidence: 99%
“…Using this method, the resulting clusters' quality may be evaluated. With a high Silhouette Score, linked malware samples have been effectively categorized, and the resulting clusters are distinctive [47]. The output quality of a clustering method is assessed in malware detection using Silhouette Scoring.…”
mentioning
confidence: 99%
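
The Silhouette Score mentioned in the statement above can be illustrated with a minimal sketch. It assumes Euclidean distance and the usual definition s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the smallest mean distance to any other cluster. The function name `silhouette_score` and the toy data are illustrative choices, not taken from the citing paper.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            # p is alone in its cluster; its silhouette is conventionally 0.
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])
bad = silhouette_score(toy_points, [0, 1, 0, 1])
```

A value near 1 indicates compact, well-separated clusters, while a negative value (as with the deliberately mismatched labels in `bad`) suggests points sit closer to another cluster than to their own.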
“…Since more than two measured responses are usually studied to achieve a sustainable machining process, the first step is to apply the principal component analysis (PCA) to facilitate performing the clustering step. PCA is a prevalent technique for dimension reduction algorithms [18,19], and it is performed prior to k-means clustering to alleviate the high-dimensionality issue and accelerate computations [20]. The PCA approach converts the measured responses into two main normalized responses (i.e., PCA 1 and PCA 2).…”
mentioning
confidence: 99%
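
As a rough illustration of the dimension-reduction step described in the statement above, the following sketch projects a data matrix onto its first two principal components using NumPy's SVD. Standardizing each column first is an assumption made here (the citing paper only speaks of "normalized responses"), and the name `pca_two_components` is hypothetical.

```python
import numpy as np

def pca_two_components(X):
    """Project X (n_samples x n_features) onto its first two principal components."""
    X = np.asarray(X, dtype=float)
    # Standardize each column (assumed preprocessing); constant columns are left at zero.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:2].T                      # columns play the role of PCA 1 and PCA 2
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, explained_ratio[:2]

# Hypothetical data: 50 samples with 4 measured responses.
rng = np.random.default_rng(0)
responses = rng.normal(size=(50, 4))
scores, ratio = pca_two_components(responses)
```

The two score columns can then be passed to a clustering step in place of the original responses; the explained-variance ratios give a quick check on how much information the reduction keeps.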
“…The number of clusters must be specified for this algorithm (see the previous step). This technique has been used in a wide variety of applications in a variety of fields [20,22]. The K-means algorithm takes a data set (X) with N samples and separates it into K distinct clusters, with each cluster being defined by the mean of its constituent samples.…”
mentioning
confidence: 99%
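
The K-means description in the statement above corresponds to the classic assign-then-update loop (Lloyd's algorithm), sketched below. Random initialization from the data points is an assumption made for brevity; it is not the sub-dataset initialization proposed in the cited paper. The name `kmeans` and the toy data are illustrative.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each sample joins the cluster with the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers no longer move
        labels = new_labels
        # Update step: each center becomes the mean of its constituent samples.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical toy data: N = 6 samples forming K = 2 obvious groups.
samples = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0),
           (100.0, 100.0), (101.0, 101.0), (102.0, 102.0)]
labels, centers = kmeans(samples, k=2)
```

Because the result depends on the starting centers, practical implementations usually rerun the loop from several initializations or use a more careful seeding scheme.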