2009
DOI: 10.1093/bioinformatics/btp546

Genome analysis with inter-nucleotide distances

Abstract: Motivation: DNA sequences can be represented by sequences of four symbols, but it is often useful to convert the symbols into real or complex numbers for further analysis. Several mapping schemes have been used in the past, but they seem unrelated to any intrinsic characteristic of DNA. The objective of this work was to find a mapping scheme directly related to DNA characteristics and that would be useful in discriminating between different species. Mathematical models to explore DNA correlation structures may…

Cited by 76 publications (70 citation statements)
References 21 publications
“…We also used the IN distance relative error used in (Afreixo et al, 2009) and we concatenated the first 100 IN distances and the first 20 ND distance relative errors. Only the first 20 ND distances were used, because for larger distances the distributions become sparse.…”
Section: Numerical Procedures (citation type: mentioning)
confidence: 99%
“…However, several other different mappings have been proposed (see for example, Silverman and Linsker (1986); Jeffrey (1990); Zhang and Zhang (1994); Buldyrev et al (1995); Anastassiou (2001)). In a previous work, we explored the inter-nucleotide (IN) distance, the distance to the first occurrence of the same symbol, to perform a comparative analysis between species (Afreixo et al, 2009). In this work, we present a new DNA numerical profile and a new mapping to explore the correlation structure of DNA: the distance to the nearest dissimilar (ND) nucleotide.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
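
The IN distance described in the statement above can be illustrated with a minimal Python sketch. It assumes the distance is measured forward along the sequence, from each occurrence of a symbol to the next occurrence of the same symbol; the function name `in_distances` is hypothetical and is not taken from the cited papers.

```python
def in_distances(seq, symbol):
    """Return the gaps between consecutive occurrences of `symbol` in `seq`.

    Assumption: the inter-nucleotide (IN) distance is the number of
    positions from one occurrence of a symbol to the next occurrence
    of that same symbol, reading left to right.
    """
    positions = [i for i, s in enumerate(seq) if s == symbol]
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    example = "ACGATTA"
    for nucleotide in "ACGT":
        print(nucleotide, in_distances(example, nucleotide))
```

A symbol that occurs fewer than twice yields an empty list, since no distance can be measured for it.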
“…In a simple way, DNA sequences are non-numerical sequences of the four-letter alphabet, A, C, G and T, which stand for the four nucleotides: Adenine, Cytosine, Guanine, and Thymine. Various transformations of DNA sequences into numerical data have been proposed in order to take advantage of methodologies available for quantitative data ([2,9,15,16,3,18,1]). Free-alignment algorithms have been applied to build distance trees aimed at visualizing historical evolutionary relationships among species (e.g., [21,23]).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Several numerical transformations of DNA sequences have been used to perform multiple organism comparisons. Basically, observed DNA sequences and randomly ordered sequences (random background) are compared using different procedures and discrepancy measures based either on genomic symbol frequencies (e.g., [22,14,13,20]) or on genomic symbol distance frequencies (e.g., [1]). This type of residual analysis can highlight the contribution of DNA selective evolution of each species ([17]).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
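
The comparison of observed distance frequencies against a random background can be sketched as follows. This is only an illustration under stated assumptions: the background is modeled as independent nucleotides, so the distance between consecutive occurrences of a symbol with frequency p follows a geometric distribution, p(1 - p)^(k - 1); and the relative error is taken as (observed - expected) / expected. The exact definitions used in the cited works are not given here, and the function name `in_distance_relative_error` is hypothetical.

```python
from collections import Counter


def in_distance_relative_error(seq, symbol, max_k):
    """Relative error of observed IN distance frequencies for `symbol`
    against a geometric background, for distances 1..max_k.

    Assumptions (not confirmed by the source): independent-nucleotide
    background, and relative error = (observed - expected) / expected.
    """
    positions = [i for i, s in enumerate(seq) if s == symbol]
    dists = [b - a for a, b in zip(positions, positions[1:])]
    if not dists:
        raise ValueError("symbol must occur at least twice in seq")

    p = len(positions) / len(seq)  # empirical symbol frequency
    if p >= 1.0:
        raise ValueError("sequence must contain more than one symbol")

    counts = Counter(dists)
    total = len(dists)
    errors = []
    for k in range(1, max_k + 1):
        observed = counts.get(k, 0) / total
        expected = p * (1.0 - p) ** (k - 1)  # geometric probability
        errors.append((observed - expected) / expected)
    return errors


if __name__ == "__main__":
    print(in_distance_relative_error("ATATATAT", "A", 3))
```

Values near zero indicate agreement with the background model, while large positive or negative values mark distances that are over- or under-represented in the observed sequence.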
“…a study of correlation information, sequence periodicities, and other sequence characteristics. More examples of studies focused on statistical properties of DNA sequences and also on their biological interpretation may be found in [12,13].…”
Section: Introduction (citation type: mentioning)
confidence: 99%