Evolutionary Biology—A Transdisciplinary Approach 2020
DOI: 10.1007/978-3-030-57246-4_9

Orthology: Promises and Challenges

Abstract: Orthology is a cornerstone of comparative genomics and has numerous applications in current biology. In this chapter, we first introduce the concepts of orthology and paralogy. We then present the currently available orthology inference methods and the community-led efforts of standardization and benchmarking accompanying these developments. The large panel of available orthology resources is compared in terms of species coverage, access, contextual data and tools proposed to end-users to facilitate the analys…

Citations: cited by 11 publications (15 citation statements)
References: 135 publications
“…In Table 1, we provide a summary of the difference between the methods included in the 2020 public benchmark, their main algorithmic category, the type of orthologous relations they produce, and what is their usual performance in the benchmark. A more detailed comparison of orthology inference methods and resources is available in (2, 3). The relationship between type of output and performance is clear for the few methods and algorithms which only aim to provide one-to-one orthology inference (OMA Groups (17), PANTHER LDO (18), BBH (13) and RSD (14)): they constantly rank as the method with the best accuracy at the expense of recall.…”
Section: Results For Public Methods (mentioning)
confidence: 99%
“…Orthology inference, the process of identifying genes originated from the same common ancestor through a speciation event (1), is a cornerstone of comparative genomics and phylogenetics (2). Inferring orthology is often a difficult process, since the evolutionary histories of gene families may involve multiple duplications, losses, and horizontal transfers of entire genes or individual domains, and because it aims to recapitulate events that took place millions of years ago, under unknown selective pressures, using only the information available from the genomes of modern day species (3).…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, the recent reannotation of the Daphnia pulex genome [34] showed that the high number of small proteins in the previous genome is likely spurious. These errors are possibly due to fragmented assembly leading to genome annotation errors [35,36] or to the notorious difficulty of discriminating coding and non-coding ORFs [37,38]. The overabundance of spurious proteins is likely to bias all downstream analyses involving them, leading to an inflated genome size [34,39], an inflated number of orphan genes [40], and errors in orthology inference (see the 'Addressing Proteome Quality' section in [41]), but the proportion of small proteins is generally ignored when providing a new annotation set. We propose that the distribution of protein length be used as a new criterion of protein-coding gene quality upon publication, to complement existing quality measures.…”
Section: Discussion (mentioning)
confidence: 99%
“…3 B), indicating a better quality of our new predictions. In this regard, Nevers et al. (2020) report that a non-normal distribution of protein lengths, as observed for these Prodigal predictions in eukaryotic contigs, is indicative of more truncated proteins caused by fragmented genomes and incorrect protein prediction [30].…”
Section: Data Validation and Quality Control (mentioning)
confidence: 99%