2018
DOI: 10.1002/cpbi.56

Using geneid to Identify Genes

Abstract: This unit describes the usage of geneid, an efficient gene‐finding program that allows for the analysis of large genomic sequences, including whole mammalian chromosomes. These sequences can be partially annotated, and geneid can be used to refine this initial annotation. Training geneid is relatively easy, and parameter configurations exist for a number of eukaryotic species. geneid produces output in a variety of standard formats. The results, thus, can be processed by a variety of software tools, including …

Cited by 129 publications (92 citation statements)
References 43 publications
“…Second, the complete Rosaceae proteome was downloaded from Uniprot on July 2015 and aligned to the genome using Exonerate (v.2.4.7) (Slater and Birney, ). Third, ab initio gene predictions were performed on the repeat masked pdulcis26 assembly with three different programs: GeneID v.1.4 (Alioto et al , ), Augustus v.3.2.3 (Stanke et al , ) and GeneMark‐ES v.2.3e (Lomsadze et al , ) with and without incorporating evidence from the RNA‐seq data. Finally, all the data were combined into consensus coding sequence models using EvidenceModeler‐1.1.1 (EVM) (Haas et al , ).…”
Section: Methods (mentioning)
confidence: 99%
“…We combined Ab initio and homology-based prediction methods to construct consensus gene models. We used PASA (PASA, RRID: SCR_014656) (Haas et al, 2003), Genscan (GENSCAN, RRID: SCR_012902) (Burge and Karlin, 1997), Augustus (Augustus: Gene Prediction, RRID: SCR 008417) (Stanke et al, 2006), GlimmerHMM (GlimmerHMM, RRID: SCR_002654) (Majoros et al, 2004), GeneID (GeneID, SCR_002473) (Alioto et al, 2018), and SNAP (Korf, 2004) to search for gene models. Using the homology-based method, protein sequences of Caenorhabditis elegans (nematode, GCA_000002985.3), Capitella teleta (Marine worm, GCA_000328365.1), and Helobdella robusta (freshwater leech, GCA_000326865.1) were downloaded from the NCBI GenBank database (GenBank, RRID: SCR_002760) (Benson et al, 2014) and implemented for library construction.…”
Section: Gene Prediction (mentioning)
confidence: 99%
“…Protein coding sequences (CDSs) were extracted from the reconstructed transcripts using TransDecoder (v3.0.1), a utility included with Trinity to assist with the identification of potential coding region [37]. The prediction of coding regions is based on search of all possible CDSs, verification of the predicted CDSs by GENEID [38], and selecting the region that has the highest score among candidate sequences.…”
Section: RNA Sequencing and De Novo Assembly (mentioning)
confidence: 99%