2003
DOI: 10.1093/bioinformatics/btf843

FORRepeats: detects repeats on entire chromosomes and between genomes

Abstract: We present a heuristic method, named FORRepeats, which is based on a novel data structure called the factor oracle. In the first step, it detects exact repeats in large sequences. In the second step, it computes approximate repeats and performs pairwise comparison. We compared its computational characteristics with BLAST and REPuter. Results demonstrate that it is fast and space economical. We show FORRepeats' ability to perform intra-genomic comparison and to detect repeated DNA sequences in the complete geno…
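
The abstract names the factor oracle but does not show it. As a rough illustration only, the sketch below follows the standard online construction of a factor oracle (an automaton with one state per sequence position plus an initial state, recognising at least every factor of the sequence). It is not the FORRepeats implementation, and the names `build_factor_oracle` and `accepts` are hypothetical.

```python
def build_factor_oracle(seq):
    """Build a factor oracle for `seq` with the standard online algorithm.

    Returns (trans, supply), where trans[i] maps a character to the target
    state and supply[i] is the supply (failure) link of state i.
    Assumption: this is a textbook sketch, not the FORRepeats code.
    """
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]   # states 0..n
    supply = [-1] * (n + 1)                  # supply link of state 0 is undefined (-1)
    for i, c in enumerate(seq):
        trans[i][c] = i + 1                  # "spine" transition to the new state
        k = supply[i]
        # Walk the supply links, adding shortcut transitions labelled c
        # until a state that already has a c-transition is found.
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, word):
    """Return True if `word` can be read from the initial state."""
    state = 0
    for c in word:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


# Hypothetical usage on a short DNA-like string.
oracle, links = build_factor_oracle("ACGTACGA")
print(accepts(oracle, "TACG"))
```

The oracle can also accept a few words that are not factors, which is one reason a method built on it is described as heuristic; any candidate repeat found this way would still need checking against the sequence.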

Cited by 57 publications (35 citation statements)
References 20 publications

“…In recent decades, growing quantities of DNA and protein sequence data have been and are being generated, increasing the need for compact and efficient representations to store and search such data [1,2]. Similar developments can be found for natural language processing as well.…”
Section: Introduction (mentioning)
confidence: 81%
“…Firstly, factors that occur multiple times within a given DNA sequence may indicate interesting genetic phenomena such as tandem repeats and micro-satellites [3]. To detect such factors, an index representing all the factors of a given sequence can be constructed in such a way that it provides information about repeated factors, including where they are located in the given sequence [1,4].…”
Section: Introduction (mentioning)
confidence: 99%
“…Several algorithmic methods have been proposed for efficiently detecting tandem or dispersed repeats in DNA sequences; see [3], [4], [5], [6], and [7]. But, a relevant statistical analysis of these repeats (count, length, location) should then be done in order to distinguish the significant repeats from those that can be just obtained 'by chance'.…”
mentioning
confidence: 99%
“…The study of approximate repeats is important in functional genomics and several bioinformatics tools are available exclusively focused on this subject [17,18]. The SimSearch's similarity matrices also record the pattern interruptions.…”
Section: Approximate Repetitions (mentioning)
confidence: 99%