2018
DOI: 10.1186/s12918-018-0630-6

A multiple kernel density clustering algorithm for incomplete datasets in bioinformatics

Abstract: Background: While there are a large number of bioinformatics datasets for clustering, many of them are incomplete, i.e., missing attribute values in some data samples needed by clustering algorithms. A variety of clustering algorithms have been proposed in the past years, but they usually are limited to cluster on the complete dataset. Besides, conventional clustering algorithms cannot obtain a trade-off between accuracy and efficiency of the clustering process since many essential parameters are determined by t…

Cited by 9 publications (2 citation statements)
References 32 publications
“…After the creation of these groups, the representative group for patients with anaphylactic shock will have to be identified. More than that, with the multiple kernel density clustering, it is possible to "recover" missing data values from the dataset in attempt to obtain a database as complete as possible [94].…”
Section: Artificial Intelligence Guardship
Type: mentioning
confidence: 99%
“…Efficient missing value imputation (Patil et al, 2010)
Technique is generalized and can be utilized for many data sets (Ishay and Herman, 2015)
Impute missing values and build clusters as a unified integrated process (Abdallah and Shimshoni, 2016)
K-means + radial basis function (RBF): Faster convergence speed, higher stability, accuracy (Shi et al, 2018)
Local least squares: Local data clustering being incorporated for improved quality and efficiency (Keerin et al, 2013)
Multiple kernel density: Accuracy and efficiency (Liao et al, 2018)
Rough set: Handles the uncertainty and vagueness existing in data sets (Amiri and Jensen, 2016)
Less computational complexity (Azam et al, 2018)
Overcome the problem of crispness (Raja et al, 2019)
Shell neighbor: Fills in an incomplete instance in a given data set by only using its left and right nearest neighbors with respect to each factor (attribute) and generalized to deal with data sets of mixed attributes (Zhang, 2011)
Sliding window: Applicable for IoT devices' data (Kolomvatsos et al, 2019)
Soft cluster: Overcomes the problems of inconsistency (Raja and Thangavel, 2016)
Decision tree
Branch-exclusive splits trees (BEST): A new classification procedure that can handle missing values by using data partitioning and better accuracy (Beaulac and Rosenthal, 2020)
Boosted trees: Able to handle missingness from data fusion, deterministic or distribution-free data sets (D'Ambrosio et al, 2012)
C4.5: Generalized approach that uses index measure in the estimation of missing values (Madhu and Rajinikanth, 2012)
Classification and regression trees (CART): A robust method to deal with different missing value types (Nikfalazar et al, 2020)
Decision trees and forests: A higher quality of imputation using similarity and correlations…”
Section: K-means
Type: mentioning
confidence: 99%