2012
DOI: 10.1093/nar/gks1239

KEGG OC: a large-scale automatic construction of taxonomy-based ortholog clusters

Abstract: The identification of orthologous genes in an increasing number of fully sequenced genomes is a challenging issue in recent genome science. Here we present KEGG OC (http://www.genome.jp/tools/oc/), a novel database of ortholog clusters (OCs). The current version of KEGG OC contains 1 176 030 OCs, obtained by clustering 8 357 175 genes in 2112 complete genomes (153 eukaryotes, 1830 bacteria and 129 archaea). The OCs were constructed by applying the quasi-clique-based clustering method to all possible protein co…

Cited by 107 publications (73 citation statements)
References 22 publications
“…The analysis of archaeal protein families relied on the assignments in the latest public release of the COG database (Tatusov et al, 2003), currently available at ftp://ftp.ncbi.nlm.nih.gov/pub/COG/COG/whog, the NCBI’s RefSeq database (Pruitt et al, 2012), and the recently updated version of archaeal COGs (Wolf et al, 2012), which is available at ftp://ftp.ncbi.nih.gov/pub/wolf/COGs/arCOG. We also used the latest versions of the eggNOG, KEGG and MetaCyc databases (Caspi et al, 2012; Powell et al, 2012; Nakaya et al, 2013). Pathway details were taken from KEGG and the available literature data.…”
Section: Cog-based Pathway Analysis | mentioning
confidence: 99%
“…To analyze the evolutionary conservation, we selected model organisms with rich genome information, i.e., Saccharomyces cerevisiae (budding yeast), Schizosaccharomyces pombe (fission yeast), Caenorhabditis elegans (worm), Drosophila melanogaster (fly), Danio rerio (fish), Canis familiaris (dog), Mus musculus (mouse), Pan troglodytes (chimpanzee), and Homo sapiens (human). We obtained orthologous gene sets from KEGG OC [29] for the nine species. We constructed multiple sequence alignments of phosphoproteins in each orthologous group and identified species where a known phosphosite was conserved in a motif.…”
Section: Results | mentioning
confidence: 99%
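
The conservation check described in the statement above can be illustrated with a minimal Python sketch. It is not the citing authors' code: the function names, the toy alignment, and the rule that any of S, T or Y at the aligned column counts as "conserved" are all assumptions made for illustration, and the sketch ignores the surrounding motif that the statement mentions. The keys are KEGG-style three-letter organism codes; the sequences are invented.

```python
from typing import Dict, List, Optional

# Assumption: a site counts as conserved if the aligned residue is
# phosphorylatable (serine, threonine or tyrosine).
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def alignment_column(aligned_seq: str, residue_pos: int) -> Optional[int]:
    """Map a 1-based position in the ungapped sequence to a 0-based
    column of the gapped alignment row. Returns None if out of range."""
    seen = 0
    for col, ch in enumerate(aligned_seq):
        if ch != "-":
            seen += 1
            if seen == residue_pos:
                return col
    return None


def conserved_species(
    alignment: Dict[str, str], reference: str, site_pos: int
) -> List[str]:
    """List the species whose aligned residue at the reference
    phosphosite column is S, T or Y (in alignment order)."""
    col = alignment_column(alignment[reference], site_pos)
    if col is None:
        raise ValueError("site position lies outside the reference sequence")
    return [
        species
        for species, row in alignment.items()
        if row[col] in PHOSPHO_RESIDUES
    ]


# Toy, hand-made alignment of one hypothetical ortholog cluster.
toy_alignment = {
    "hsa": "MK-RSPEL",  # human, reference row; S is residue 4
    "mmu": "MKARSPEL",  # mouse
    "dme": "MK-RTPDL",  # fly, S replaced by T
    "sce": "MQ-RAP-L",  # budding yeast, S replaced by A
}

conserved = conserved_species(toy_alignment, reference="hsa", site_pos=4)
print(conserved)
```

Mapping the site through the reference row matters because gaps shift alignment columns relative to residue numbering; a real analysis would also compare the flanking motif residues rather than the single column alone.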
“…The three-letter code for each genome is the species identifier defined by KEGG. Orthologous genes among these genomes were defined by KEGG OC [29]. For each ortholog cluster, multiple sequence alignments were constructed by MAFFT [53], which is a freely available, rapid, and reliable tool compared with other alignment tools.…”
Section: Methods | mentioning
confidence: 99%
“…not AraCyc, are computationally predicted (Zhang et al, 2010; Nakaya et al, 2013; Kanehisa et al, 2014; Seaver et al, 2014). Additionally, in many of the cases, and this problem is particularly acute in plants, the set of computationally predicted genes associated with reactions may be homologous, but do not perform the same catalytic function (i.e., they are out-paralogs).…”
Section: Evidence For Gene-reaction Associations | mentioning
confidence: 99%