2006
DOI: 10.1093/bioinformatics/btl623

Protein solubility: sequence based prediction and experimental verification

Abstract: We present a machine-learning approach called PROSO to assess the chance of a protein to be soluble upon heterologous expression in Escherichia coli based on its amino acid composition. The classification algorithm is organized as a two-layered structure in which the output of primary support vector machine (SVM) classifiers serves as input for a secondary Naive Bayes classifier. Experimental progress information from the TargetDB database as well as previously published datasets were used as the source of tra…
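The abstract states that the prediction is based on amino acid composition. As a minimal sketch of what such an input could look like, the snippet below computes single-residue frequencies for a sequence. The function name `amino_acid_composition` is hypothetical, and the exact feature set used by PROSO is not given in this excerpt, so this should be read as an illustration of the general idea rather than the authors' feature extraction.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration only; it is not PROSO's code.
    Symbols outside the 20 standard residues (e.g. X or gaps) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur, so every key is present.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    composition = amino_acid_composition("MKTAYIAKQR")
    for residue, fraction in composition.items():
        if fraction > 0:
            print(f"{residue}: {fraction:.2f}")
```

A fixed-length vector of this kind is the sort of input that a classifier such as an SVM could be trained on.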

Cited by 136 publications (145 citation statements)
References 57 publications
“…Despite testing a wide range of purification methods, buffer conditions and additives, different tags and tag positions, and alternative hosts for protein expression (data not shown), we were unable to obtain soluble purified protein for further analysis. This result is consistent with a high calculated probability of Rv0485 insolubility using PROSO (59). We were therefore unable to experimentally determine whether Rv0485 is able to directly bind DNA.…”
Section: Identification Of Rv0485
supporting
confidence: 73%
“…Sequence alignments were performed using ClustalW2 software (http://www.ebi.ac.uk/Tools/clustalw2/index.html) (34). Protein solubility prediction software (PROSO) was accessed via the Expropriator Web server (http://mips.helmholtz-muenchen.de/proso/proso.seam) (59). Proteins of known structures were queried via the RCSB Protein Data Bank Web server (http://www.rcsb.org/pdb/home/home.do).…”
mentioning
confidence: 99%
“…Moreover, the solubility of these proteins was confirmed by another sequence-based protein solubility evaluator (PROSO server, Smialowski et al [34]). Consequently, the solubility of the fusion protein MBP * CP may not be related to the low expression levels of this protein in E. coli.…”
Section: Discussion
mentioning
confidence: 93%
“…The 26-kD GST tag is short (218 aa) [23] and is frequently used as a fusion in molecular biology research. The study from Pawel and co-workers [24] defined soluble protein as the fraction that stayed in solution, did not oligomerize strongly and was stable (did not precipitate or aggregate), while insoluble proteins are defined as those that cannot stay in solution without denaturing agents like urea. Our β-galactosidase protein was successfully purified from the soluble protein extracts of yeast strain CEN.PK2 lysates.…”
Section: Purification Of β-Galactosidase Protein From Saccharomyces C
mentioning
confidence: 99%