We introduce a 3D graphical representation of DNA sequences based on pairs of dual nucleotides (DNs). From this representation, we derive several mathematical invariants and construct two 16-component vectors associated with them. The vectors are used to characterize and compare the complete coding sequences of the β-globin gene of nine species. The examination of similarities and dissimilarities illustrates the utility of the approach.
On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequences. The representation avoids degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. A multicomponent vector, whose components are derived from the graphical representation of the DNA primary sequence, is then introduced to characterize DNA sequences quantitatively. The examination of similarities and dissimilarities among the complete coding sequences of the β-globin gene of 11 species and among six ND6 proteins shows the utility of the scheme.
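The abstract does not say how the Huffman codes are mapped onto a graph, so the sketch below illustrates only the Huffman coding step itself: it builds a prefix-free binary code for the bases of a sequence from their observed frequencies. The function names `huffman_codes` and `encode` and the toy sequence are illustrative assumptions, not taken from the paper.

```python
import heapq
from collections import Counter


def huffman_codes(sequence):
    """Return a dict mapping each symbol in `sequence` to a binary Huffman code."""
    counts = Counter(sequence)
    if not counts:
        raise ValueError("sequence must not be empty")
    if len(counts) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(counts)): "0"}

    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker ensures the dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lightest subtree
        w2, _, right = heapq.heappop(heap)   # second lightest subtree
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the code of every base in `sequence`."""
    return "".join(codes[base] for base in sequence)


# Toy sequence (hypothetical): A is most frequent, T is rarest.
toy = "AAAAAAAACCCCGGT"
codes = huffman_codes(toy)
bits = encode(toy, codes)
```

Frequent bases receive shorter codes, and because no code is a prefix of another, the bit string can be decoded back to the original sequence without ambiguity. That reversibility is in the spirit of the "no loss of information" property the abstract claims for the representation.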
We introduce a new approach to investigate the dual nucleotide compositions of 11 Gram-positive and 12 Gram-negative eubacteria recently studied by Sorimachi and Okayasu. The approach first obtains a set of 16-dimensional dual nucleotide vectors from the complete genome of each organism by means of the PN-curve. Each vector in the set corresponds to a single gene of the genome. We then reduce the 16-dimensional vector set to two dimensions by principal component analysis (PCA). This reduction avoids the loss of information that could result from averaging all the 16-dimensional vectors. Finally, we suggest a 2D graphical representation based on the 2-dimensional vectors to investigate the classification patterns among different organisms.
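The pipeline described above (one 16-dimensional vector per gene, then PCA down to two dimensions) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract derives its vectors from the PN-curve, which is not specified here, so plain overlapping dinucleotide frequencies are used as a stand-in, and the gene strings are made up. The helper names `dinucleotide_vector` and `pca_2d` are hypothetical, and NumPy is assumed to be available.

```python
import itertools

import numpy as np

# All 16 dual nucleotides (AA, AC, ..., TT) in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in itertools.product("ACGT", repeat=2)]
INDEX = {dn: i for i, dn in enumerate(DINUCLEOTIDES)}


def dinucleotide_vector(gene):
    """Relative frequencies of the 16 overlapping dinucleotides in `gene`.

    Stand-in for the PN-curve-derived vector mentioned in the abstract.
    """
    counts = np.zeros(16)
    for i in range(len(gene) - 1):
        dn = gene[i:i + 2]
        if dn in INDEX:  # skip pairs containing ambiguous bases such as N
            counts[INDEX[dn]] += 1
    total = counts.sum()
    return counts / total if total else counts


def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    X = np.asarray(vectors, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Made-up gene sequences, for illustration only.
genes = [
    "ATGGCGTACGTTAG",
    "ATGAAATTTGGGCCCTAA",
    "ATGCGCGCGCGATATATTGA",
    "ATGACGTAGCTAGCTAGTAA",
]
vectors = [dinucleotide_vector(g) for g in genes]
points = pca_2d(vectors)  # one (x, y) point per gene
```

Each gene ends up as a single point in the plane, so an organism is represented by a cloud of points rather than by one averaged vector, which is the motivation the abstract gives for using PCA instead of averaging.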