2006
DOI: 10.1142/s021972000600203x

Efficient Algorithms and Software for Detection of Full-Length LTR Retrotransposons

Abstract: LTR retrotransposons constitute one of the most abundant classes of repetitive elements in eukaryotic genomes. In this paper, we present a new algorithm for detection of full-length LTR retrotransposons in genomic sequences. The algorithm identifies regions in a genomic sequence that show structural characteristics of LTR retrotransposons. Three key components distinguish our algorithm from that of current software: (i) a novel method that preprocesses the entire genomic sequence in linear time and produces hi…

Cited by 34 publications (25 citation statements)
References 35 publications
“…This provides correct identification of fragmented TEs in nested repeat clusters, but reconstruction of whole TEs and evolutionary timeline of insertions is not possible. LTR retrotransposon detection software, such as LTR_struct (McCarthy and McDonald, 2003; Kalyanaraman and Aluru, 2005), groups LTR pairs based on sequence alignment identity. With LTR pair locations, one can infer a general retrotransposon insertion order; however, nested repeats are not specifically addressed and an LTR broken from subsequent insertions will not be identified.…”
mentioning
confidence: 99%
“…As evaluated in Ref. [4], LTR harvest and LTR-FINDER+LTR harvest generated a large number of false-positive LTRs which required further removal through annotations. Most false positives resulted from duplicated genes and, as such, possessed no significant internal protein domains.…”
Section: Results
mentioning
confidence: 99%
“…Several programs have been released for identifying full-length or intact LTRs, such as LTR_STRUCT [3], LTR_PAR [4], FIND_LTR [5], LTR_FINDER [6] and LTR harvest [7]. These tools take into account several major characteristics of LTRs such as the size range of intact LTRs, the distances between two LTRs of intact elements, the presence of target site duplications (TSDs) at each terminal region, the presence of critical sites for reversing transcribing elements for transposition such as the primer binding site (PBS) and the poly purine tract (PPT), and the identity percentage between two LTRs.…”
Section: Introduction
mentioning
confidence: 99%
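
The structural criteria named in the statement above can be read as a filter over candidate LTR pairs. The following is a minimal Python sketch of that idea, not the method of any of the cited tools. The `Candidate` fields, the threshold values, and the choice to measure distance between LTR start positions are all assumptions made for illustration, and the PBS and PPT checks are left out.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate LTR pair (hypothetical record layout, 0-based half-open coordinates)."""
    ltr5_start: int
    ltr5_end: int
    ltr3_start: int
    ltr3_end: int
    identity: float  # fraction of identical positions between the two LTRs


# Hypothetical thresholds, chosen only for illustration.
MIN_LTR_LEN, MAX_LTR_LEN = 100, 2000
MIN_DIST, MAX_DIST = 1000, 15000
MIN_IDENTITY = 0.85
TSD_LEN = 5


def has_tsd(seq: str, cand: Candidate, k: int = TSD_LEN) -> bool:
    """True if the k bases flanking the element on each side are identical."""
    if cand.ltr5_start < k:
        return False
    left = seq[cand.ltr5_start - k:cand.ltr5_start]
    right = seq[cand.ltr3_end:cand.ltr3_end + k]
    return len(right) == k and left == right


def passes_structure(seq: str, cand: Candidate) -> bool:
    """Apply the size, distance, identity, and TSD checks to one candidate."""
    len5 = cand.ltr5_end - cand.ltr5_start
    len3 = cand.ltr3_end - cand.ltr3_start
    # Assumption: distance is measured between the two LTR start positions.
    dist = cand.ltr3_start - cand.ltr5_start
    return (
        MIN_LTR_LEN <= len5 <= MAX_LTR_LEN
        and MIN_LTR_LEN <= len3 <= MAX_LTR_LEN
        and MIN_DIST <= dist <= MAX_DIST
        and cand.identity >= MIN_IDENTITY
        and has_tsd(seq, cand)
    )
```

Real tools typically expose such thresholds as user parameters, so the constants here should be treated as placeholders.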
“…This is achieved by deploying a strategy of first identifying pairs of exact (maximal, to be precise) matching substrings as "seeds" and extending the seeds outwards through sequence alignment [12]. The rationale is that a substantially long (MinExactMatch) exact match is a necessary but not sufficient indicator for a satisfactory alignment (MinSimilarity); thus, generating pairs of loci with long exact matching pairs provides a good filter to predict potential aligning regions.…”
Section: Phase 1: Candidate Pattern Identification
mentioning
confidence: 99%
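
The seed-and-extend strategy quoted above can be sketched in Python as follows. This is an illustration under stated assumptions, not the cited implementation. A k-mer hash index stands in for whatever index the authors use to find maximal matches, and a simple ungapped X-drop extension stands in for full sequence alignment. The names `min_exact_match` and `min_similarity` mirror the quoted parameters, while `x_drop` and all function names are hypothetical.

```python
from collections import defaultdict


def maximal_exact_matches(seq, min_exact_match):
    """Return sorted (i, j, length) triples, i < j, for maximal exact repeats."""
    k = min_exact_match
    index = defaultdict(list)
    for pos in range(len(seq) - k + 1):
        index[seq[pos:pos + k]].append(pos)
    seeds = set()
    for positions in index.values():
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                i, j = positions[a], positions[b]
                # Not left-maximal: the pair (i - 1, j - 1) already covers it.
                if i > 0 and seq[i - 1] == seq[j - 1]:
                    continue
                length = k
                while j + length < len(seq) and seq[i + length] == seq[j + length]:
                    length += 1
                seeds.add((i, j, length))
    return sorted(seeds)


def extend(seq, i, j, step, x_drop=3):
    """Ungapped extension from (i, j) in direction step (+1 or -1).

    Scores +1 per match and -1 per mismatch, stops once the score falls
    x_drop below its best value, and returns (columns, matches) at the best score.
    """
    score = best = 0
    n = best_n = matches = best_matches = 0
    while i >= 0 and j < len(seq) and best - score < x_drop:
        n += 1
        if seq[i] == seq[j]:
            score += 1
            matches += 1
            if score > best:
                best, best_n, best_matches = score, n, matches
        else:
            score -= 1
        i += step
        j += step
    return best_n, best_matches


def candidate_pairs(seq, min_exact_match, min_similarity):
    """Extend each seed outwards and keep pairs whose identity is high enough."""
    found = set()
    for i, j, length in maximal_exact_matches(seq, min_exact_match):
        left_n, left_m = extend(seq, i - 1, j - 1, -1)
        right_n, right_m = extend(seq, i + length, j + length, +1)
        total = left_n + length + right_n
        identity = (left_m + length + right_m) / total
        if identity >= min_similarity:
            found.add((i - left_n, j - left_n, total, identity))
    return sorted(found)
```

The exact-match step acts as a cheap filter, so the more expensive extension runs only on pairs of loci that already share a long identical substring. The sketch does not stop the two copies from overlapping, so tandem or low-complexity regions can also produce seeds.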
“…Substantial research over the last decade has led to the development of several excellent repeat identification methods and software tools [2,9,12,14,18,20,24]. While these methods differ from one another in their underlying algorithms and approaches, most of them share the following set of characteristics in their general approach towards repeat identification: (i) detection based on sequence similarity, (ii) targeting specific types of repeats, and (iii) assuming that the set of structural attributes that characterize each of their target repeat classes is known a priori to the user so that they can be provided as part of the input.…”
Section: Introduction
mentioning
confidence: 99%