2020
DOI: 10.1371/journal.pcbi.1007553

Scalable phylogenetic profiling using MinHash uncovers likely eukaryotic sexual reproduction genes

Abstract: Phylogenetic profiling is a computational method to predict genes involved in the same biological process by identifying protein families which tend to be jointly lost or retained across the tree of life. Phylogenetic profiling has customarily been more widely used with prokaryotes than eukaryotes, because the method is thought to require many diverse genomes. There are now many eukaryotic genomes available, but these are considerably larger, and typical phylogenetic profiling methods require at least quadrati…

Citations: cited by 20 publications (26 citation statements)
References: 70 publications (100 reference statements)
“…Clusters of coevolving genes were first determined using the weighted minhash, using a low threshold (0.3) for similarity ( 33 ). Because of the low threshold and probabilistic nature of the minhash distance and resulting clustering, we then broke up clusters identified using the weighted minhash, as these clusters may contain illegitimately clustered vectors.…”
Section: Methods | Type: mentioning
confidence: 99%
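
The thresholding step described in the statement above can be illustrated with a minimal sketch. It uses plain (unweighted) MinHash rather than the weighted MinHash named in the quote, and it is not the cited authors' implementation. The function names, the number of hash functions, and the toy profiles are assumptions made for illustration; only the 0.3 threshold comes from the quoted text.

```python
import hashlib
from itertools import combinations

NUM_HASHES = 128  # assumed signature length; the source does not state one


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=NUM_HASHES):
    """Plain MinHash: keep the smallest hash value for each seeded hash function."""
    items = list(items)
    if not items:
        raise ValueError("cannot build a signature for an empty profile")
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def pairs_above_threshold(profiles, threshold=0.3):
    """Return (name_a, name_b, estimate) for profile pairs at or above the threshold."""
    signatures = {name: minhash_signature(items) for name, items in profiles.items()}
    pairs = []
    for a, b in combinations(sorted(signatures), 2):
        estimate = estimated_similarity(signatures[a], signatures[b])
        if estimate >= threshold:
            pairs.append((a, b, estimate))
    return pairs
```

Because the estimate is probabilistic, a low threshold such as 0.3 can admit pairs whose true similarity is lower, which is consistent with the quoted statement's decision to break up the resulting clusters afterwards.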
“…Thus, such co-evolution patterns can be used to infer functionally related genes—a technique known as ‘phylogenetic profiling’ ( 28 ). Recently, we introduced HogProf, an algorithm to efficiently identify similar HOGs in terms of their presence or absence at each extant and ancestral node in the genome taxonomy, as well as the duplication or loss events on the branch leading to that node ( 29 ). This functionality has been added to the OMA browser, making it possible, starting from any HOG, to identify similar HOGs using similar phylogenetic patterns.…”
Section: Phylogenetic Profiling | Type: mentioning
confidence: 99%
“…Of note, the visual representation of the profile available on the web interface only shows the extant species covered by the query and returned HOGs. The actual profile similarities are calculated between the set of taxonomic nodes where ancestral presence was inferred along with extant species, as well as the set of ancestral duplications and losses shared between HOGs ( 29 ). Finally, phylogenetic profiles are also retrievable via the REST programmatic interface (under HOGs methods).…”
Section: Phylogenetic Profiling | Type: mentioning
confidence: 99%
“…Phylogenetic profiling is one such analysis method. In this method, when two ortholog groups (OGs) have similar occurrence patterns among species in a table of OGs, the two OGs are presumed to be functionally related ( Kensche et al , 2008 ; Moi et al , 2020 ; Niu et al , 2017 ; Pellegrini et al , 1999 ; Stupp et al , 2021 ; Tremblay et al , 2021 ; Tsaban et al , 2021 ). Although phylogenetic profiling was first proposed to detect protein–protein interactions, this method in principle captures any functional relationships between genes.…”
Section: Introduction | Type: mentioning
confidence: 99%
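
The idea in the statement above, comparing occurrence patterns of ortholog groups across species, can be sketched with an exact Jaccard similarity. This is an illustrative example only: the OG names, species labels, and function names are hypothetical, and Jaccard similarity is one common choice of measure rather than the one any cited paper necessarily uses.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two sets of species in which an OG is present."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy table: OG name -> species in which the OG occurs.
profiles = {
    "OG1": {"sp1", "sp2", "sp3", "sp5"},
    "OG2": {"sp1", "sp2", "sp3"},
    "OG3": {"sp4", "sp6"},
}


def most_similar(query, table):
    """Rank all other OGs by profile similarity to the query OG, best first."""
    scores = [
        (other, jaccard(table[query], species))
        for other, species in table.items()
        if other != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Here OG1 and OG2 share three of four species in their union, so they would be flagged as candidates for a functional relationship, while OG3 shares none. Comparing every pair this way costs time quadratic in the number of OGs, which is the scaling problem the abstract mentions.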
“…In other words, the conventional calculation of similarity introduces an evolutionary bias in the estimation. Therefore, methods that consider a phylogenetic tree were proposed and showed good performance ( Barker et al , 2007 ; Cohen et al , 2012 ; Moi et al , 2020 ; Ta et al , 2011 ; Vert, 2002 ).…”
Section: Introduction | Type: mentioning
confidence: 99%