2007
DOI: 10.1007/s00224-007-9078-6

A New Combinatorial Approach to Sequence Comparison

Abstract: In this paper we introduce a new alignment-free method for comparing sequences which is combinatorial by nature and uses neither a compressor nor any information-theoretic notion. The method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in the context of Data Compression. The new extended transformation takes as input a multiset of sequences and produces as output a string obtained by a suitable rearrangement of the characters of all the input sequences. By using …
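
The abstract is truncated here, so the exact construction is not visible on this page. As a rough illustration only, the sketch below assumes the definition of the extended transform commonly given in the literature: take every rotation of every input word, sort the rotations by comparing their infinite repetitions, and read off the last character of each. The function names `omega_compare` and `ebwt` are hypothetical, and the input words are assumed to be non-empty (the literature additionally assumes they are primitive).

```python
from functools import cmp_to_key


def omega_compare(u, v):
    """Compare the infinite repetitions u^omega and v^omega.

    Assumption: comparing prefixes of length len(u) + len(v) is enough,
    because two infinite periodic words that agree that far are equal.
    """
    n = len(u) + len(v)
    prefix_u = (u * (n // len(u) + 1))[:n]
    prefix_v = (v * (n // len(v) + 1))[:n]
    return (prefix_u > prefix_v) - (prefix_u < prefix_v)


def ebwt(words):
    """Sketch of an extended BWT over a collection of non-empty words."""
    # Every rotation (conjugate) of every word in the collection.
    rotations = [w[i:] + w[:i] for w in words for i in range(len(w))]
    # Sort by the order on infinite repetitions, not plain lexicographic order.
    rotations.sort(key=cmp_to_key(omega_compare))
    # The output is the concatenation of the last characters.
    return "".join(r[-1] for r in rotations)


if __name__ == "__main__":
    # "ba" sorts before "b" here because baba... < bbbb...
    print(ebwt(["b", "ab"]))
```

The output has exactly as many characters as the inputs combined, which matches the abstract's description of a rearrangement of the characters of all the input sequences.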

Cited by 34 publications (23 citation statements)
References 21 publications
“…In stark contrast to the plethora of studies investigating the accuracy of alignment-based tree reconstruction, surprisingly little is known about the accuracy of alignment-free methods, due to an almost complete absence of systematic and comprehensive large-scale studies from this field. In the context of phylogenetics, studies that introduce a new method have usually characterized its accuracy by comparing at most a handful reconstructed trees to "standard" trees derived from alignments, focusing on the clustering of subgroups and the placement of taxa instead of emphasizing numerical results (even though studies may otherwise be large-scale: Li et al, 2001; Otu and Sayood, 2003; Stuart et al, 2002a, 2002b; Berry, 2003, 2004; Qi et al, 2004; Chu et al, 2004; Hao and Qi, 2004; Yu and Anh, 2004; Yang et al, 2005; Mantaci et al, 2005). This makes it difficult to extract useful generalizations from this literature, especially considering that data sets vary from paper to paper.…”
mentioning
confidence: 99%
“…A class of similarity measures was defined by Mantaci et al [17] over an extension of the Burrows-Wheeler transform for string collections, called eBWT [16]. Later, Yang et al [34,35] recrafted the method by Mantaci et al and introduced the Burrows-Wheeler similarity distribution (BWSD) of two strings S_1 and S_2 based on the BWT of their concatenation.…”
Section: Introduction
mentioning
confidence: 99%
“…There are also some other important methods such as Lempel-Ziv (LZ) complexity, Burrows-Wheeler (BW) transform [1, 23-28] which are based on compression algorithm, but do not actually apply the compression. The Burrows-Wheeler Transform (BWT) was introduced by Burrows and Wheeler in 1994, and is recently studied also from a combinatorial point of view [24-26, 29-32]. Loosely speaking, BWT can map any finite string (word) over an ordered alphabet to another one which can be compressed easier.…”
Section: Introduction
mentioning
confidence: 99%
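
To make the remark about compressibility concrete, here is a minimal sketch of the textbook rotation-based BWT, not code taken from any of the cited papers. The helper names `bwt` and `count_runs` are hypothetical, and the quadratic rotation sort is chosen for readability rather than speed. It counts runs of equal characters before and after the transform as a crude proxy for how easily a run-length coder could compress the string.

```python
def bwt(word):
    """Return (last column, row index of the original word).

    Textbook construction: sort all rotations of the word and
    concatenate the last character of each rotation.
    """
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    last_column = "".join(r[-1] for r in rotations)
    return last_column, rotations.index(word)


def count_runs(text):
    """Count maximal runs of identical consecutive characters."""
    return sum(1 for i, ch in enumerate(text) if i == 0 or ch != text[i - 1])


if __name__ == "__main__":
    transformed, index = bwt("banana")
    print(transformed, index)
    print(count_runs("banana"), count_runs(transformed))
```

On this small input the transformed string has fewer runs than the original, which is the effect the quoted passage alludes to; on arbitrary inputs the number of runs is not guaranteed to decrease.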
“…Loosely speaking, BWT can map any finite string (word) over an ordered alphabet to another one which can be compressed easier. To compare the similarity of two sequences, Mantaci et al [25,30] introduced an extension of the Burrows-Wheeler Transform (EBWT) and defined a class of dissimilarity measures. While Yang et al [1] used a Burrows-Wheeler similarity distribution (BWSD) based on Burrows-Wheeler transform to express the similarity between two protein sequences.…”
Section: Introduction
mentioning
confidence: 99%