2012
DOI: 10.3923/tb.2012.47.52

Evaluation of k-modes-type Algorithms for Clustering Y-Short Tandem Repeats Data

Cited by 6 publications (8 citation statements) · References 7 publications
“…In all situations the average recovery rates were larger for data generated from the Beta distribution. The best results were achieved for k=2 and degrees of overlapping 1 and 2. When k=2 and the degree of overlapping increased to 3 and 4, the ARR values dropped, ranging from 51.07 to 76.46% (Beta data) and 50 to 64.1% (Uniform data).…”
Section: Results (mentioning; confidence: 99%)
“…Some of these algorithms are direct variations of the k-modes or fuzzy k-modes (see [4] and [5], for example). The variations include changing the dissimilarity measure used to compare the objects to the cluster centroids, or the cluster centroids themselves by adding more information from the data set to their definition, besides using the hard mode. The k-representatives [12] and the k-populations ([9], [1]) are examples of fuzzy k-modes variations. Among the algorithms based on different concepts to cluster data we find the hierarchical method ROCK [17], which uses the number of links between objects to identify which are neighbors and belong to the same cluster; the entropy-based algorithms LIMBO [13] and COOLCAT [14]; STIRR [16], based on nonlinear dynamical systems from multiple instances of weighted hypergraphs; GoM, a parametric procedure based on the assumption that the data follow a multivariate multinomial distribution ([3], [25]); MADE [3], which uses concepts of rough set theory to handle uncertainty of the partition; CACTUS [18], based on the idea of co-occurrence between attributes and pairs, defining a cluster in terms of its 2D projections; and the subspace algorithms SUBCAD [11] and CLICK [7], whose main goal is to locate clusters in different subspaces of the data set in order to overcome the difficulties found in clustering high-dimensional data; many other algorithms can be found.…”
Section: Introduction (mentioning; confidence: 99%)
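The quoted passage surveys k-modes-type algorithms, which cluster categorical data by replacing k-means centroids with modes and Euclidean distance with simple matching dissimilarity. As a minimal sketch of the basic idea only (not the implementation evaluated in any of the cited papers; all names are illustrative):

```python
import random
from collections import Counter

def matching_dissimilarity(a, b):
    """Simple matching dissimilarity: count of attributes on which two
    categorical objects disagree (the standard k-modes measure)."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(cluster):
    """Column-wise mode of a list of categorical tuples."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))

def k_modes(data, k, max_iter=100, seed=0):
    """Basic hard k-modes: assign each object to its nearest mode, then
    recompute each cluster's mode, until the assignment stabilises."""
    rng = random.Random(seed)
    modes = rng.sample(data, k)          # naive initialisation from the data
    labels = [None] * len(data)
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: matching_dissimilarity(x, modes[j]))
            for x in data
        ]
        if new_labels == labels:         # converged: assignments unchanged
            break
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:                  # keep the old mode for an empty cluster
                modes[j] = column_modes(members)
    return labels, modes
```

The variants mentioned in the quote mostly change `matching_dissimilarity` or the mode update; fuzzy k-modes replaces the hard assignment with membership degrees.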
“…Furthermore, the Y-STR data and their clustering results have also been published in a journal called Journal of Genetic Genealogy, a journal of genetic genealogical community [13]. A more comprehensive benchmark, involving six Y-STR dataset items and eight existing partitional algorithms, has also been reported [14]. The outcomes of this result indicate that the Y-STR data are quite unique compared to other categorical data, characterizing many similar and almost similar objects.…”
Section: Introduction (mentioning; confidence: 99%)
“…[13-19] used partitional clustering techniques to group Y-STR data by the number of repeats, a method used in genetic genealogy applications. In this study, we continue efforts to partition the Y-STR data based on the partitional clustering approaches carried out in [13-19]. Recently, we have also evaluated eight partitional clustering algorithms over six Y-STR datasets [19].…”
Section: Introduction (mentioning; confidence: 99%)
“…In this study, we continue efforts to partition the Y-STR data based on the partitional clustering approaches carried out in [13-19]. Recently, we have also evaluated eight partitional clustering algorithms over six Y-STR datasets [19]. As a result, we found that there is scope to propose a new partitioning algorithm to improve the overall clustering results for the same datasets.…”
Section: Introduction (mentioning; confidence: 99%)