2021
DOI: 10.1093/nar/gkab395

MyCLADE: a multi-source domain annotation server for sequence functional exploration

Abstract: The ever-increasing number of genomic and metagenomic sequences accumulating in our databases requires accurate approaches to explore their content against specific domain targets. MyCLADE is a user-friendly webserver designed for targeted functional profiling of genomic and metagenomic sequences based on a database of a few million probabilistic models of Pfam domains. It uses the MetaCLADE multi-source domain annotation strategy, modelling domains based on multiple probabilistic profiles. MyCLADE takes a lis…

Cited by 6 publications (4 citation statements)
References 33 publications
“…log2FC of each OG compared to average expression in cell > 2 and if no hits were recovered the threshold was lowered to 1, Fig. S6) were described by 5 levels of annotation (Table S3): general annotation with NCBI blastp (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome, default parameters), protein family prediction with InterProscan (v98.0, https://www.ebi.ac.uk/interpro/) (Paysan-Lafosse et al, 2023), functional PFAM domain identification with myCLADE (http://www.lcqb.upmc.fr/myclade/large_annotation.php, complete model library, e-value < 1E-06) (Vicedomini et al, 2021) and structural homology identification Phyre (v2.0, http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) (Kelley et al, 2015). OGs with consensus annotation relevant to the life cycle were selected as potential acantharian life cycle markers.…”
Section: Methods
mentioning
confidence: 99%
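The selection logic quoted in the statement above (a log2FC cutoff of 2 that is lowered to 1 when nothing passes, followed by an e-value cutoff of 1E-06 on domain hits) can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the example identifiers are hypothetical and chosen only for illustration; they are not taken from the citing paper or from the MyCLADE server.

```python
from typing import Dict, List


def select_enriched(log2fc: Dict[str, float],
                    primary: float = 2.0,
                    fallback: float = 1.0) -> List[str]:
    """Return IDs whose log2FC exceeds `primary`.

    If no ID passes, retry once with the lower `fallback` cutoff,
    mirroring the "threshold was lowered to 1" rule in the quote.
    The input layout (ID -> log2FC) is an assumption.
    """
    hits = [og for og, fc in log2fc.items() if fc > primary]
    if not hits:
        hits = [og for og, fc in log2fc.items() if fc > fallback]
    return sorted(hits)


def filter_domain_hits(hits: List[dict],
                       max_evalue: float = 1e-6) -> List[dict]:
    """Keep domain hits with an e-value strictly below `max_evalue`.

    Each hit is assumed to be a dict with an "evalue" key; this is a
    hypothetical record format, not the server's output format.
    """
    return [h for h in hits if h["evalue"] < max_evalue]


if __name__ == "__main__":
    # Hypothetical expression values for three orthologous groups.
    print(select_enriched({"OG1": 2.5, "OG2": 1.5, "OG3": 0.2}))
    # Hypothetical domain hits with made-up labels and e-values.
    print(filter_domain_hits([
        {"domain": "domA", "evalue": 1e-10},
        {"domain": "domB", "evalue": 1e-3},
    ]))
```

Keeping the fallback inside one function makes the two-stage cutoff explicit; the e-value filter is kept separate because the quote applies it to a different annotation step.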
“…Identifying and characterizing the complete molecular systems involved in such processes requires functional knowledge of the proteins involved. Many studies [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18] have demonstrated the power of phylogenetic and molecular evolutionary analysis in determining the molecular functions of such proteins. A variety of individual tools exist for protein sequence similarity searches or ortholog detection, delineating co-occurring domains (domain architectures), and building multiple sequence alignments and phylogenetic trees.…”
mentioning
confidence: 99%
“…The sequence-based computational method ProfileView (Vicedomini, Bouly et al 2019) has been recently designed to address the functional classification of the great diversity of homologous sequences hiding, in many cases, a variety of functional activities that cannot be anticipated. ProfileView relies on two main ideas: the use of multiple probabilistic models whose construction explores evolutionary information in large datasets of sequences (Bernardes, Zaverucha et al 2016) (Ugarte, Vicedomini et al 2018) (Vicedomini, Blachon et al 2021) (Fortunato, Jaubert et al 2016) (Amato, Dell'Aquila et al 2017), and a new definition of a representation space where to look at sequences from the point of view of probabilistic models combined together. ProfileView has been previously applied to classify families of proteins for which functions should be discovered or characterized within known groups (Vicedomini, Blachon et al 2021).…”
mentioning
confidence: 99%
“…ProfileView relies on two main ideas: the use of multiple probabilistic models whose construction explores evolutionary information in large datasets of sequences (Bernardes, Zaverucha et al 2016) (Ugarte, Vicedomini et al 2018) (Vicedomini, Blachon et al 2021) (Fortunato, Jaubert et al 2016) (Amato, Dell'Aquila et al 2017), and a new definition of a representation space where to look at sequences from the point of view of probabilistic models combined together. ProfileView has been previously applied to classify families of proteins for which functions should be discovered or characterized within known groups (Vicedomini, Blachon et al 2021). It was proven very successful in identifying functional differences between otherwise phylogenetically similar sequences.…”
mentioning
confidence: 99%