2007
DOI: 10.1109/tcbb.2007.1004

Ortholog Clustering on a Multipartite Graph

Abstract: We present a method for automatically extracting groups of orthologous genes from a large set of genomes by a new clustering algorithm on a weighted multipartite graph. The method assigns a score to an arbitrary subset of genes from multiple genomes to assess the orthologous relationships between genes in the subset. This score is computed using sequence similarities between the member genes and the phylogenetic relationship between the corresponding genomes. An ortholog cluster is found as the subset with the…

Citations: cited by 16 publications (11 citation statements)
References: 30 publications
“…This is because the algorithm must: (i) iterate over all edges e(u, v) in C (Step 3), with the worst-case complexity O(m) = O(g^2); and for each, (ii) look for a vertex w and edge f(u, w) in G (Step 4), which is at worst O(g) if it must look through all other genomes in the g-partite graph; and finally for each of these, (iii) check whether u and w are adjacent in G, which is an efficient O(log g) lookup from the list of all adjacent vertices of w (or v). The worst-case complexity of EdgeSearch is comparable to the O(V^3) (V = number of vertices) of another heuristic method described in Vashist et al. (2007), but uses different topological information, i.e. triangles in a SymBets graph rather than dense clusters (quasi-cliques) in a graph that may include all edges and does not require a species tree.…”
Section: Results (citation type: mentioning)
Confidence: 97%
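
The three nested steps in the statement above can be illustrated with a short Python sketch. It is not the EdgeSearch implementation: the names `find_triangles`, `is_adjacent`, `cluster_edges` and `adjacency` are hypothetical, the graph is a toy example, and the sketch assumes the closing check in step (iii) is between v and w, since that is what completes a triangle. Adjacency lists are kept sorted so that the membership test is a binary search, matching the logarithmic lookup mentioned in the quote.

```python
from bisect import bisect_left


def is_adjacent(sorted_neighbors, target):
    """Binary search in a sorted neighbour list (logarithmic lookup)."""
    pos = bisect_left(sorted_neighbors, target)
    return pos < len(sorted_neighbors) and sorted_neighbors[pos] == target


def find_triangles(cluster_edges, adjacency):
    """Return every triangle {u, v, w} that extends an edge (u, v) of the cluster.

    cluster_edges: iterable of (u, v) pairs (the edges of C in the quote).
    adjacency: dict mapping each vertex to a SORTED list of its neighbours in G.
    """
    triangles = set()
    for u, v in cluster_edges:              # (i) every edge e(u, v) in C
        for w in adjacency.get(u, []):      # (ii) every edge f(u, w) in G
            if w == v:
                continue
            # (iii) assumed closing check: is w also adjacent to v?
            if is_adjacent(adjacency.get(v, []), w):
                triangles.add(frozenset((u, v, w)))
    return triangles


# Toy graph with one gene per genome; vertex names are made up.
adjacency = {
    "a1": ["b1", "c1", "d1"],
    "b1": ["a1", "c1"],
    "c1": ["a1", "b1"],
    "d1": ["a1"],
}
triangles = find_triangles([("a1", "b1")], adjacency)
```

Here the edge (a1, b1) is extended by c1 but not by d1, because d1 has no edge to b1.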
“…Examples of automated implementations of the former approach include the publicly available algorithms EnsemblCompara (Vilella et al., 2009), SYNERGY (Wapinski et al., 2007), RIO (Zmasek and Eddy, 2002), Orthostrapper (Storm and Sonnhammer, 2002) and the databases of orthologous protein families HOBACGEN, HOVERGEN and HOGENOME (Dufayard et al., 2005), whereas examples of the latter include OrthoMCL (Li et al., 2003), eggNOG (Jensen et al., 2008), InParanoid and MultiParanoid (Alexeyenko et al., 2006; O'Brien et al., 2005; Remm et al., 2001), MSOAR and MultiMSOAR (Fu and Jiang, 2007; Fu et al., 2007), Homologene (Sayers et al., 2010), RoundUp (Deluca et al., 2006) and OMA (Roth et al., 2008). Still other methods exist that do not fall neatly into either category, such as that described in (Vashist et al., 2007), which uses topological distance in a species tree as a factor in a linkage equation to find dense clusters in a multipartite graph (whose edges are not restricted to SymBets).…”
Section: Introduction (citation type: mentioning)
Confidence: 99%
“…This combinatorial optimization problem has been studied in [20] and it has been shown that an efficient algorithm exists for finding the global optimal solution H* if the linkage function π(i, H) is monotone increasing. The monotone increasing property requires that the value of the linkage function for the vertex i can only increase when the second argument H increases in a set theoretic sense, i.e.…”
Section: Combinatorial Selection Of Characteristic Image Patches (citation type: mentioning)
Confidence: 99%
“…The algorithm for solving this combinatorial optimization problem is given [20], and is described in the pseudocode form in Algorithm 3.1. This iterative algorithm begins by calculating F(V+) and finds the set M1 containing the set of vertices from V+ which have the minimum value of the linkage function, i.e.…”
Section: Combinatorial Selection Of Characteristic Image Patches (citation type: mentioning)
Confidence: 99%
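
The iterative procedure described in the two statements above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Algorithm 3.1 or the code from [20]: it assumes the subset score is F(H) = min over i in H of π(i, H), that each step removes every vertex attaining that minimum, and that the best-scoring subset seen along the way is returned. The function name `best_subset`, the toy `weights` and the example `linkage` (sum of non-negative edge weights to the other members of H, which is monotone increasing in H) are all hypothetical.

```python
def best_subset(vertices, linkage):
    """Peel off minimum-linkage vertices and keep the best-scoring subset.

    linkage(i, H) is assumed to be monotone increasing in H.
    Returns (subset, score) where score = min over i in subset of linkage(i, subset).
    """
    current = set(vertices)
    best_set, best_score = set(), float("-inf")
    while current:
        values = {i: linkage(i, current) for i in current}
        score = min(values.values())            # F(current)
        if score > best_score:                  # strict '>' keeps the largest optimum
            best_set, best_score = set(current), score
        # Remove every vertex that attains the minimum linkage value.
        minimizers = {i for i, value in values.items() if value == score}
        current -= minimizers
    return best_set, best_score


# Toy symmetric similarity weights (made-up values).
weights = {
    frozenset(("a", "b")): 5,
    frozenset(("a", "c")): 5,
    frozenset(("b", "c")): 5,
    frozenset(("c", "d")): 1,
}


def linkage(i, subset):
    """Sum of edge weights from i to the other members of subset."""
    return sum(weights.get(frozenset((i, j)), 0) for j in subset if j != i)


cluster, score = best_subset(["a", "b", "c", "d"], linkage)
```

In this toy run the weakly attached vertex d is removed first, after which a, b and c all share the same linkage value and form the returned subset.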