2010 International Symposium on Information Technology (ITSim 2010)
DOI: 10.1109/itsim.2010.5561471

Attribute value weighting in K-modes clustering for Y-short tandem repeats (Y-STR) surname

Abstract: This paper evaluates Y-STR surname data for attribute value weighting in the k-Modes clustering algorithm. Three categories of weighting schemas are evaluated: (1) Relative Value Frequency (RVF); (2) Uncommon Attribute Value Matches; […] Y-STR surname data. The overall results show that the clustering accuracy of all methods falls only between 40% and 44%. However, the idea of adapting a weighting schema still looks promising as a way to improve the clustering accuracy for Y-STR data.
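
The abstract does not reproduce the weighting formulas, so the short Python sketch below only illustrates the general idea behind frequency-based attribute value weighting in k-Modes clustering. The relative-value-frequency weight, the mismatch rule, the function names, and the toy haplotype tuples are all assumptions made for illustration; they are not the exact RVF or other schemas evaluated in the paper.

    import random
    from collections import Counter

    def rvf_weights(data):
        # Relative frequency of each value within each attribute (column).
        n = len(data)
        return [{v: c / n for v, c in Counter(col).items()} for col in zip(*data)]

    def weighted_mismatch(x, mode, weights):
        # Simple matching dissimilarity, but a mismatch against a frequent modal
        # value counts for less than a mismatch against a rare one.
        # (Illustrative weighting rule only, not the paper's exact schema.)
        return sum(0.0 if xj == mj else 1.0 - weights[j].get(mj, 0.0)
                   for j, (xj, mj) in enumerate(zip(x, mode)))

    def k_modes(data, k, iters=20, seed=0):
        random.seed(seed)
        weights = rvf_weights(data)
        modes = random.sample(data, k)
        clusters = [[] for _ in range(k)]
        for _ in range(iters):
            # Assign every object to the nearest mode under the weighted measure.
            clusters = [[] for _ in range(k)]
            for x in data:
                best = min(range(k), key=lambda c: weighted_mismatch(x, modes[c], weights))
                clusters[best].append(x)
            # Update each mode to the most frequent value per attribute.
            new_modes = [
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
                if members else modes[c]
                for c, members in enumerate(clusters)
            ]
            if new_modes == modes:
                break
            modes = new_modes
        return modes, clusters

    # Toy Y-STR haplotypes: tuples of repeat counts treated as categorical values.
    haplotypes = [(13, 24, 14), (13, 24, 15), (14, 23, 14), (14, 23, 15), (13, 25, 14)]
    modes, clusters = k_modes(haplotypes, k=2)

In plain k-Modes every attribute mismatch costs exactly 1; the weighting schemas studied in the paper replace that flat cost with value-frequency information, which is the behaviour the sketch mimics.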

Cited by 4 publications (5 citation statements)
References 11 publications

“…However, where the sample size is large and/or there are multiple samples, multi-criteria analyses such as supervised and unsupervised learning methods produce results that are more informative. Indeed, several methods for grouping multiple samples of Y-STR data automatically have been reported (Schlecht et al, 2008 ; Seman et al, 2010a ; 2012 ; 2013a ). In the supervised learning method, Y-STR data can be classified by haplogroup via the decision tree method (Schlecht et al, 2008 ; Seman et al, 2013a ), Bayesian modeling, and support vector machines (Schlecht et al, 2008 ).…”
Section: Introduction (mentioning; confidence: 99%)

“…In the supervised learning method, Y-STR data can be classified by haplogroup via the decision tree method (Schlecht et al, 2008 ; Seman et al, 2013a ), Bayesian modeling, and support vector machines (Schlecht et al, 2008 ). Similarly, unsupervised learning methods can be used to cluster Y-STR data by similar genetic distances (Seman et al, 2010a ; 2010b ; 2010c ; 2010d ; 2012 ).…”
Section: Introduction (mentioning; confidence: 99%)

“…The Y-STR data have been applied and used in clustering Y-surname and Y-haplogroup applications. Initial benchmarking results of clustering Y-STR data have been reported (see, e.g., [8][9][10][11][12]). Furthermore, the Y-STR data and their clustering results have also been published in the Journal of Genetic Genealogy, a journal of the genetic genealogy community [13].…”
Section: Introduction (mentioning; confidence: 99%)

“…For example, Schlecht et al [12] used machine learning techniques to classify Y-STR fragments into related groups. Furthermore, Seman et al [13][14][15][16][17][18][19] used partitional clustering techniques to group Y-STR data by the number of repeats, a method used in genetic genealogy applications. In this study, we continue efforts to partition the Y-STR data based on the partitional clustering approaches carried out in [13][14][15][16][17][18][19].…”
Mentioning (confidence: 99%)

“…Furthermore, Seman et al [13][14][15][16][17][18][19] used partitional clustering techniques to group Y-STR data by the number of repeats, a method used in genetic genealogy applications. In this study, we continue efforts to partition the Y-STR data based on the partitional clustering approaches carried out in [13][14][15][16][17][18][19]. Recently, we have also evaluated eight partitional clustering algorithms over six Y-STR datasets [19].…”
Mentioning (confidence: 99%)