2001
DOI: 10.1021/ci0000981

Characterization of DNA Primary Sequences Based on the Average Distances between Bases

Abstract: We outline a numerical characterization of DNA primary sequences based on calculation of the average distance between pairs of nucleic acid bases. This leads to a representation of DNA by a condensed 4 x 4 symmetrical matrix, the elements of which give the average separation between pairs of bases X, Y in DNA (X, Y = A, C, G, T). As an invariant of choice we consider the leading eigenvalue of the derived 4 x 4 matrix. Additional structurally related invariants were obtained by constructing additional "higher order…
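The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made here for illustration: the distance between two bases is taken as the difference of their positions in the sequence, each entry averages over all position pairs of the given base types, an entry with no contributing pair is set to 0, and characters other than A, C, G, T are skipped. The function names `average_distance_matrix` and `leading_eigenvalue` are hypothetical, and the eigenvalue is estimated by plain power iteration rather than by any method named in the source.

```python
from itertools import combinations

BASES = "ACGT"  # row/column order of the 4 x 4 matrix


def average_distance_matrix(seq):
    """Return a 4 x 4 symmetric matrix of average separations between bases.

    Assumption: the separation of two bases is the difference of their
    positions, averaged over every pair of positions (i < j) holding X and Y.
    """
    idx = {b: k for k, b in enumerate(BASES)}
    total = [[0.0] * 4 for _ in range(4)]
    count = [[0] * 4 for _ in range(4)]
    for (i, x), (j, y) in combinations(enumerate(seq.upper()), 2):
        if x not in idx or y not in idx:
            continue  # assumption: ignore symbols other than A, C, G, T
        a, b = idx[x], idx[y]
        d = j - i
        total[a][b] += d
        count[a][b] += 1
        if a != b:  # mirror off-diagonal contributions to keep symmetry
            total[b][a] += d
            count[b][a] += 1
    # Assumption: an entry with no contributing pair is reported as 0.
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(4)]
        for r in range(4)
    ]


def leading_eigenvalue(m, max_iter=1000, tol=1e-12):
    """Estimate the largest eigenvalue of a symmetric non-negative matrix
    by power iteration, using the Rayleigh quotient as the estimate."""
    n = len(m)
    v = [1.0] * n
    lam = 0.0
    for _ in range(max_iter):
        w = [sum(m[r][c] * v[c] for c in range(n)) for r in range(n)]
        new_lam = sum(a * b for a, b in zip(v, w)) / sum(a * a for a in v)
        norm = max(abs(x) for x in w)
        if norm == 0.0:
            return 0.0  # zero matrix: every eigenvalue is 0
        v = [x / norm for x in w]
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam


if __name__ == "__main__":
    matrix = average_distance_matrix("ATGGTGCACC")  # arbitrary toy sequence
    for row in matrix:
        print(["%.3f" % x for x in row])
    print("leading eigenvalue:", leading_eigenvalue(matrix))
```

The pairwise loop is quadratic in sequence length, which is acceptable for a short illustration but would need a linear-time accumulation for long sequences. Power iteration is chosen only to keep the sketch free of external dependencies; a library eigenvalue routine would be the more usual choice.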

Cited by 43 publications (30 citation statements)
References 53 publications
“…1. The similarity between pairs of candidates
pair of rep. set    similarity
(3,5)               0.677
(1,2)               0.674
(4,8)               0.659
(3,4)               0.655
(4,5)               0.655
(6,7)               0.653
(1,8)               0.652
(6,8)               0.651…”
Section: Hierarchical Clustering Based On Representative Sets (mentioning)
confidence: 99%
“…The distance measure requires a geometrical representation of an object. This, however, is far from being unique in the case of a symbolic sequence [1,2]. There are several definitions of similarity as well [3][4][5].…”
Section: Introduction (mentioning)
confidence: 99%
“…One can view this sequence to define a partition of a line, the entries of the sequence giving the length of individual segments on the line. The line distance matrix, which has only recently received some attention [69][70][71], is [Table 2: The upper triangular part of the augmented distance matrix for terminal vertices of the star graph of Fig. 2] [Table 3: The eigenvalues of the distance matrix for terminal vertices (TD) of the graph of Fig.…”
Section: Line Distance Matrix (mentioning)
confidence: 99%
“…At the same time, the number of DNA and protein QSAR studies is increasing (Agrawal et al, 2005; Arteca and Tapia, 1999; Hua and Sun, 2001; Randic and Balaban, 2003) by the creation of new macromolecular descriptors named topological indices (TIs) using graph theory. The branch of mathematical chemistry dedicated to encode the DNA/protein information in graph representations by the use of TIs has become an intense research area with interesting works of Liao (Liao and Ding, 2005; Liao and Wang, 2004a, b; Liao et al, 2006), Randic, Nandy, Balaban, Basak and Vracko (Randic and Balaban, 2003; Randic, 2000; Randic and Basak, 2001; Randic et al, 2000) or our group (Aguero-Chapin et al, 2006). In addition, the computational approaches and theoretical analyses, such as structural bioinformatics (Chou, 2004a, b), network approach (Chou et al, 2006a, b; Chou and Cai, 2006; González-Díaz et al, 2008), molecular docking (Gao et al, 2007; Li et al, 2007; Zhang et al, 2006; Wang et al, 2008; Zheng et al, 2007), pharmacophore modeling (Chou et al, 2006a, b; Sirois et al, 2004), protein cleavage site prediction (Chou, 1993, 1996; Du et al, 2005a, b), QSAR (Du et al, 2005a, b, 2008; Gonzalez-Diaz et al, 2006a, b, c) and graphical operations (Althaus et al, 1993a, b; Andraos, 2008; Chou, 1989, 1990; Chou et al, 1994; González-Díaz et al, 2008), are providing very useful information and insights for drug design during the course of drug development.…”
Section: Introduction (mentioning)
confidence: 99%