2018
DOI: 10.7717/peerj.4264

Genomic signal processing for DNA sequence clustering

Abstract: Genomic signal processing (GSP) methods which convert DNA data to numerical values have recently been proposed, which would offer the opportunity of employing existing digital signal processing methods for genomic data. One of the most used methods for exploring data is cluster analysis which refers to the unsupervised classification of patterns in data. In this paper, we propose a novel approach for performing cluster analysis of DNA sequences that is based on the use of GSP methods and the K-means algorithm.…
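To make the abstract's idea concrete, the following is a minimal sketch of one possible GSP-plus-K-means pipeline, not the authors' implementation. It assumes equal-length sequences, a Voss-style binary indicator mapping, summed DFT power spectra as features, and Euclidean distance inside a plain Lloyd's K-means. The function names (`voss`, `power_spectrum`, `features`, `kmeans`) and the choice to seed centroids with the first `k` points are hypothetical and chosen only for illustration.

```python
import cmath
import math

BASES = "ACGT"

def voss(seq):
    """Voss-style mapping: one binary indicator signal per base."""
    return {b: [1.0 if ch == b else 0.0 for ch in seq.upper()] for b in BASES}

def power_spectrum(signal):
    """Naive O(n^2) DFT power spectrum for frequencies 1..n//2."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def features(seq):
    """Sum the four per-base power spectra into one feature vector."""
    spectra = [power_spectrum(sig) for sig in voss(seq).values()]
    return [sum(vals) for vals in zip(*spectra)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50):
    """Lloyd's K-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: two period-2 sequences and two period-3 sequences of length 12.
seqs = ["ACACACACACAC", "ACGACGACGACG", "GTGTGTGTGTGT", "TGATGATGATGA"]
labels = kmeans([features(s) for s in seqs], k=2)
print(labels)  # → [0, 1, 0, 1]
```

In this toy input the period-2 sequences concentrate their spectral power at the highest retained frequency and the period-3 sequences at one third of the sampling rate, so the two groups separate cleanly. Real sequences of differing lengths would need resampling or windowing before their spectra can be compared.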

Cited by 30 publications (31 citation statements)
References 45 publications
“…He chose Euclidean distance as the similarity measure to be adopted by the K-means algorithm. This method can be used to evaluate the ability of markers or genes to distinguish organisms at different levels, identify subgroups in a group of organisms, and classify fragments of DNA sequences based on known sequences (Mendizabal-Ruiz et al, 2018). Mendizabal-Ruiz G has demonstrated that it is possible to group DNA sequences based on their frequency components.…”
Section: DNA Sequence Clustering (mentioning)
confidence: 99%
“…In this work, a similar algorithm is implemented to analyse nucleotide sequences: each nucleotide position in a sequence is represented as a four elements vector, the Voss representation [24], encoding the probability of each base according to previously aligned reads. This numerical representation of DNA sequence is appropriate for the comparison of DNA sequences [25] and their classification [26]. In molecular biology, a similar algorithm has been applied to the clustering of amino acid sequences [27] where vector quantization is used to estimate the probability density of amino acids.…”
Section: Methods (mentioning)
confidence: 99%
“…The results showed that the FTIR sampling techniques had a significant influence on the spectral characteristics, spectral quality, and sampling efficiency. Ruiz et al [32] proposed a novel approach for performing cluster analysis of DNA sequences that is based on the use of Genomic signal processing GSP methods and the K-means algorithm. We also propose a visualization method that facilitates the easy inspection and analysis of the results and possible hidden behaviors.…”
Section: DNA Sequence (mentioning)
confidence: 99%