Benchmarking spliced alignment programs including Spaln2, an extended version of Spaln that incorporates additional species-specific features

Iwata, Hiroaki; Gotoh, Osamu

doi:10.1093/nar/gks708

Cited by 211 publications

(168 citation statements)

References 37 publications

Supporting

Mentioning: 166

Contrasting


“…Gene models were predicted using a combined approach of homology‐based (spaln version 2.1; Iwata & Gotoh) and ab initio gene prediction (augustus version 3.0.3, Stanke et al.; glimmerhmm version 3.0.3, Kelley et al.…”

Section: Methods (mentioning)

confidence: 99%

Draft genome of an iconic Red Sea reef fish, the blacktail butterflyfish (Chaetodon austriacus): current status and its characteristics

DiBattista

Wang

Saenz‐Agudelo

et al. 2016

Molecular Ecology Resources


Butterflyfish are among the most iconic of the coral reef fishes and represent a model system to study general questions of biogeography, evolution and population genetics. We assembled and annotated the genome sequence of the blacktail butterflyfish (Chaetodon austriacus), an Arabian region endemic species that is reliant on coral reefs for food and shelter. Using available bony fish (superclass Osteichthyes) genomes as a reference, a total of 28 926 high-quality protein-coding genes were predicted from 13 967 assembled scaffolds. The quality and completeness of the draft genome of C. austriacus suggest that it has the potential to serve as a resource for studies on the co-evolution of reef fish adaptations to the unique Red Sea environment, as well as a comparison of gene sequences between closely related congeneric species of butterflyfish distributed more broadly across the tropical Indo-Pacific.


“…However, no attempt appears to have been made to examine whether the use of a profile can improve the quality of a spliced alignment, although an appreciable number of spliced alignment programs have been developed so far [27] including GeneWise [28] that supports profile-HMM-based spliced alignment. To test the effects of profiles, we used the same dataset, P491, as that used in previous studies [29,30].…”

Section: Results (mentioning)

confidence: 99%

Assessment and refinement of eukaryotic gene structure prediction with gene-structure-aware multiple protein sequence alignment

2014

Self Cite


Background: Accurate computational identification of eukaryotic gene organization is a long-standing problem. Despite the fundamental importance of precise annotation of genes encoded in newly sequenced genomes, the accuracy of predicted gene structures has not been critically evaluated, mostly due to the scarcity of proper assessment methods. Results: We present a gene-structure-aware multiple sequence alignment method for gene prediction using amino acid sequences translated from homologous genes from many genomes. The approach provides rich information concerning the reliability of each predicted gene structure. We have also devised an iterative method that attempts to improve the structures of suspiciously predicted genes based on a spliced alignment algorithm using consensus sequences or reliable homologs as templates. Application of our methods to cytochrome P450 and ribosomal proteins from 47 plant genomes indicated that 50–60% of the annotated gene structures are likely to contain some defects. Whereas more than half of the defect-containing genes may be intrinsically broken, i.e. they are pseudogenes or gene fragments, located in unfinished sequencing areas, or corresponding to non-productive isoforms, the defects found in a majority of the remaining gene candidates can be remedied by our iterative refinement method. Conclusions: Refinement of eukaryotic gene structures mediated by gene-structure-aware multiple protein sequence alignment is a useful strategy to dramatically improve the overall prediction quality of a set of homologous genes. Our method will be applicable to various families of protein-coding genes if their domain structures are evolutionarily stable. It is also feasible to apply our method to gene families from all kingdoms of life, not just plants.


“…For example, GLEAN [44] has, according to Web Of Science, been used to annotate 41 genomes, seven of them insect genomes (four of them published since 2013), although it appeared to have inferior accuracy in the nGASP assessment [54] and the EVM paper [9]. As another example, exonerate appears to be used more often than newer spliced alignment programs with higher accuracy [8,30]. Also, large centers may favor tools that were developed in-house.…”

Section: Discussion (mentioning)

confidence: 99%

“…exonerate [29], Spaln [30], ProSplign [31]) or a representation of a protein family Approaches to predict coding genes (dark blue). RNA-Seq can be de novo assembled to transcripts (A, e.g.…”

Section: Homology-based Approaches (mentioning)

confidence: 99%

“…The decrease is faster for methods that rely mostly on the alignment (exonerate, ProSplign) than for methods that use statistical models on gene structures, such as ab initio gene finders [30,33]. Iwata and Gotoh report in their assessment much better results for Spaln than for the two very often cited tools exonerate and GeneWise, in particular for more distant species pairs [30].…”

Section: Homology-based Approaches (mentioning)

confidence: 99%


Current methods for automated annotation of protein-coding genes

Hoff

Stanke

2015

Current Opinion in Insect Science

