2011
DOI: 10.1093/nar/gkr948

InterPro in 2011: new developments in the family and domain prediction database

Abstract: InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein…

Citations: cited by 940 publications (749 citation statements)
References: 28 publications
“…In addition to homology, there exist many AFP methods that exploit additional information extracted from the genome sequence, e.g., conserved gene neighborhoods (Ling et al, 2009), phylogenetic distribution (Pellegrini et al, 1999), protein motifs and biophysical properties (Ofer and Linial, 2015), codon usage biases (Kriško et al, 2014), remote homology (Hawkins et al, 2009; Sokolov and Ben-Hur, 2010), and composition of protein domains (Hunter et al, 2011; Punta et al, 2011). Moreover, inference using genomic information can be further supplemented by experimental data: gene expression (Tian et al, 2008), protein-protein interactions (Cao and Cheng, 2015) or protein structure (Wass et al, 2012), and also by text-mining the scientific literature.…”
Section: Introduction
mentioning
confidence: 99%
“…Only transcripts with a C-score ≥ 0.5, and peptide coverage ≥ 0.5, were retained. Finally, gene models with more than 30% of their coding peptides annotated as Pfam 65 or Interprot 66 TE domains were filtered out. Functional annotation of protein-coding genes was achieved using BLASTP 67 (E-value 1e-05) against two integrated protein sequence databases; SwissProt and TrEMBL 68 .…”
mentioning
confidence: 99%
“…Functional annotation of protein-coding genes was achieved using BLASTP 67 (E-value 1e-05) against two integrated protein sequence databases; SwissProt and TrEMBL 68 . Protein domains were annotated by searching against the InterPro (V32.0) 66 and Pfam (V27.0) databases 65 , using InterProScan (V4.8) and HMMER 69 (V3.1), respectively. The Gene Ontology 70 (GO) terms for each gene were obtained from the corresponding InterPro or Pfam entry.…”
mentioning
confidence: 99%
“…To identify protein domains, the nucleotide sequences were conceptually translated via ESTScan (Lottaz et al 2003) and subsequently analysed via InterProScan (Hunter et al 2011). Parasitiformes mRNA and UniGene sequences were downloaded from NCBI to generate the customized scoring matrix used in ESTScan translation.…”
Section: Functional Sequence Annotation
mentioning
confidence: 99%