2014
DOI: 10.1093/bioinformatics/btu331

kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison

Abstract: Motivation: Alignment-based methods for sequence analysis have various limitations if large datasets are to be analysed. Therefore, alignment-free approaches have become popular in recent years. One of the best known alignment-free methods is the average common substring approach that defines a distance measure on sequences based on the average length of longest common words between them. Herein, we generalize this approach by considering longest common substrings with k mismatches. We present a greedy heurist…
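To make the quantity in the abstract concrete, here is a minimal Python sketch of the average length of the longest common substrings with at most k mismatches. It is a brute-force illustration of the definition only, not the greedy heuristic the paper presents, and it does not compute the paper's distance measure. The function names `longest_k_mismatch_extension` and `average_k_mismatch_length` are hypothetical and chosen here for illustration.

```python
def longest_k_mismatch_extension(x, y, i, j, k):
    """Length of the longest run starting at x[i] and y[j] that contains
    at most k mismatching positions (hypothetical helper, brute force)."""
    length = 0
    mismatches = 0
    while i + length < len(x) and j + length < len(y):
        if x[i + length] != y[j + length]:
            if mismatches == k:
                break  # a further mismatch would exceed the budget
            mismatches += 1
        length += 1
    return length


def average_k_mismatch_length(x, y, k):
    """Average, over every start position i in x, of the longest substring
    starting at i that matches some substring of y with at most k mismatches.

    Assumption: this is a direct O(|x| * |y| * L) evaluation of the
    definition, not the heuristic from the paper."""
    if not x or not y:
        raise ValueError("both sequences must be non-empty")
    total = 0
    for i in range(len(x)):
        total += max(
            longest_k_mismatch_extension(x, y, i, j, k) for j in range(len(y))
        )
    return total / len(x)
```

With k = 0 this reduces to the average length of exact longest common substrings; allowing more mismatches can only lengthen each match, so the average never decreases as k grows.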

Citations: cited by 124 publications (150 citation statements)
References: 32 publications
“…Heyne et al (2012) mapped RNA sequence structure information into a graph and clustered the RNAs by means of a graph kernel. Leimeister and Morgenstern (2014) compared DNA sequences based on the longest common substrings with k mismatches. Eisen (1998), Yi et al (2007), and Hatfull et al (2010) clustered genomes by various types of statistical information from DNA microarray data.…”
Section: Related Work (mentioning)
confidence: 99%
“…This value is used to estimate pairwise distance of X and Y, in an alignment-free manner, for phylogenetic tree reconstruction (Leimeister and Morgenstern 2014; Horwege et al 2014). The distance Dist_k(X, Y), computed as follows, is considered as a good estimate of the evolutionary distance between two species whose corresponding sequences are X and Y, respectively.…”
Section: Introduction (mentioning)
confidence: 99%
“…The previous solution to this problem also took near quadratic time (Leimeister and Morgenstern, 2014; Apostolico et al, 2014). To circumvent this quadratic time barrier, some fast application-specific heuristics (Leimeister and Morgenstern 2014) have already been proposed with near linear time complexity with respect to n and k. Although not as immediate as the above examples, the AAT algorithm is capable of solving another general problem.…”
Section: Introduction (mentioning)
confidence: 99%