“…For the comparison, Table 4 shows the similarity/dissimilarity between Human and other species in some other methods similarly taking the Euclidean distance as the measurement. From Table 4, finding that most listed methods 1, 2, 10, 46, 47 also make the same conclusion that Gorilla are the most similar species to Human and Chimp is the next similar species to Human except method 33 which make the similar conclusion that Chimp is the most similar species to Human and Gorilla is the next similar species to Human. Besides, some listed methods 2, 10, 47 also make the same conclusion that Gallus is the most dissimilar species to Human.…”
Section: Results (mentioning)
confidence: 59%
“…2. As seen, the similar cluster pairs are respectively as Human-Gorilla (same cluster result in 1, 2, 10, 40, 47, 48), Rat-Mouse (same result in 10, 49), Lemur-Rabbit (same cluster result in 10), Goat-Bovine (same cluster result in 10, 33, 40, 47–49), Human-Gorilla-Chimpanzee (same cluster result in 1, 10, 33, 40, 46, 48, 49). …”
Section: Results (mentioning)
confidence: 67%
“…2006 46: 0.0120 0.0155 0.0704 0.0543 0.0603 0.0287 0.0169 0.0276 0.1389 0.1146
Liu and Wang 2006 47: 0.3070 0.3101 0.4256 0.3089 0.3688 0.2968 0.4341 0.4172 0.3805 0.4479
Liao et al. 2013 2: 0.1651 0.4688 0.9202 0.6024 1.0110 0.7453 0.6010 0.6320 1.3710 1.5932
Jafarzadeh et al. 2013 1: 0.0330 0.0920 0.2160 0.1630 0.1940 0.1240 0.1650 0.2210 0.1940 0.1940
Bielinska-Waz et al.…”
A novel representation of DNA sequences that combines the global and local position information of the original sequence is proposed to distinguish different species. First, to exploit global information, a graphical representation of the DNA sequence is formulated along the curve of a Fermat spiral. Then, to capture the local characteristics of the sequence, each point on the Fermat spiral is assigned a mass based on the relationships among four neighboring nucleotides. The normalized moments of inertia of the Fermat spiral curve composed of these mass points are calculated as the numerical description of the corresponding DNA sequence, here the first exons of beta-globin genes. With the Euclidean distance as the measure between numerical descriptions, the resulting similarities between species demonstrate the performance of the proposed method.
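The pipeline in the abstract above (points on a Fermat spiral, a mass per point, normalized moments of inertia, Euclidean distance) can be illustrated with a minimal sketch. Several details are assumptions made only for illustration and are not taken from the paper: the spiral parameters `a` and `step`, the `MASS` table (the paper derives masses from four neighboring nucleotides, which is not reproduced here), and the choice of three inertia components normalized by total mass.

```python
import math

# Hypothetical per-nucleotide masses. The paper assigns masses from the
# relationships among four neighboring nucleotides; that rule is not
# given in the abstract, so a fixed table stands in for it here.
MASS = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def spiral_points(seq, a=1.0, step=math.pi / 8):
    """Place nucleotide i on the Fermat spiral r = a * sqrt(theta).

    The angle theta = (i + 1) * step is an assumed parameterization.
    Returns a list of (x, y, mass) tuples.
    """
    points = []
    for i, base in enumerate(seq.upper()):
        theta = (i + 1) * step
        r = a * math.sqrt(theta)
        points.append((r * math.cos(theta), r * math.sin(theta), MASS[base]))
    return points


def normalized_inertia(seq):
    """Return (Ixx, Iyy, Ixy) about the center of mass, divided by total mass."""
    points = spiral_points(seq)
    total = sum(m for _, _, m in points)
    cx = sum(m * x for x, _, m in points) / total
    cy = sum(m * y for _, y, m in points) / total
    ixx = sum(m * (y - cy) ** 2 for _, y, m in points) / total
    iyy = sum(m * (x - cx) ** 2 for x, _, m in points) / total
    ixy = -sum(m * (x - cx) * (y - cy) for x, y, m in points) / total
    return (ixx, iyy, ixy)


def euclidean(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.dist(u, v)
```

A smaller distance between two descriptors is read as greater similarity between the two sequences, for example `euclidean(normalized_inertia(seq_a), normalized_inertia(seq_b))` for two exon strings `seq_a` and `seq_b`.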
“…But these representational curves may degenerate, or may be not one-to-one mapping from DNA sequences. In order to overcome these defects, many new curves were introduced [11]-[19], while some new cluster methods were considered [20][21][22]. Some other representations were applied to the protein sequences [23][24][25][26].…”
Section: Journal Of Applied Mathematics and Physics (mentioning)
Background: Multiple sequence alignment (MSA) algorithms are the traditional way to compare and analyze DNA sequences. However, for large DNA sequences, these algorithms are computationally slow. Objective: We propose a new numerical method to characterize and compare DNA sequences quickly. Method: Based on a new 2-dimensional (2D) graphical representation of DNA sequences, we obtain an 8-dimensional vector using two basic concepts of probability, the mean and the variance. Results: We perform similarity/dissimilarity analyses on two real DNA data sets: the coding sequences of the first exon of the beta-globin gene of 11 species, and 31 mammalian mitochondrial genomes. Conclusion: Our results agree with existing analyses in the literature. We also compare our approach with other methods and find that ours is more effective.
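As a rough illustration of how a mean and a variance per nucleotide yield an 8-dimensional descriptor, the sketch below uses the normalized positions at which each nucleotide occurs. This is a stand-in, not the paper's 2D graphical representation, which the abstract does not specify; the function name `mean_variance_vector` and the use of normalized positions are assumptions.

```python
from statistics import fmean, pvariance


def mean_variance_vector(seq):
    """Return [mean_A, var_A, mean_C, var_C, mean_G, var_G, mean_T, var_T].

    Each mean and variance is taken over the normalized positions
    (i + 1) / n at which the nucleotide occurs. A nucleotide that is
    absent from the sequence contributes (0.0, 0.0).
    """
    seq = seq.upper()
    n = len(seq)
    vector = []
    for base in "ACGT":
        positions = [(i + 1) / n for i, b in enumerate(seq) if b == base]
        if positions:
            vector += [fmean(positions), pvariance(positions)]
        else:
            vector += [0.0, 0.0]
    return vector
```

Two sequences can then be compared by any vector distance on their descriptors, which avoids the alignment step entirely.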
Introduction
With the rapid growth of biological data, extracting more information from these large data sets is a challenge for scientists. An important problem is to find a suitable way to digitize DNA sequences so that sequence comparison can be applied. For reasons of computational time, many alignment-free sequence comparison methods have been introduced beyond traditional multiple sequence alignment (MSA); for more details, see [1] [2] [3] and the references therein.
“…However, these biological sequences do not increase our understanding of biology, so methods for analyzing these data are increasingly critical as the volume of biological sequence data increases. Sequence comparisons are fundamental operations in bioinformatics and many approaches have been proposed for comparing biological sequences, which can be categorized into two classes: alignment-based methods, where dynamic programming is used to evaluate all possible alignments and select the optimal solution with the highest score [1,2], and alignment-free methods, which measure the similarity between two biological sequences using statistical methods [3][4][5][6][7][8][9][10][11][12][13][14]. Some alignment-free methods deliver satisfactory performance [3][4][5][6][7][8], but they are still in the early stages of their development compared with alignment-based methods [9][10][11][12][13][14].…”