Low solubility of proteins overexpressed in E. coli is a frequent problem in high-throughput structural genomics. To improve the solubility of proteins from the mesophile Shewanella oneidensis MR-1 and the thermophile Clostridium thermocellum JW20, an approach was tested that combined fusion of the target protein to maltose-binding protein (MBP) with a decrease in induction temperature. MBP was selected because it was the most efficient solubilizing carrier when compared with glutathione S-transferase and NusA protein. A tobacco etch virus (TEV) protease recognition site was introduced between the fused proteins using a double polymerase chain reaction with four primers. In this way, 79 S. oneidensis proteins were each expressed in two forms: with an N-terminal 30-residue tag and as a fusion protein with MBP. A foreign tag might significantly affect the properties of the target polypeptide. With the 30-residue tag, only 5 and 17 proteins were soluble upon induction at 37 °C and 18 °C, respectively. In fusion with MBP, 4, 34, and 38 proteins were soluble upon induction at 37 °C, 28 °C, and 18 °C, respectively. MBP is assumed to increase the stability and solubility of a target protein by changing both the mechanism and the cooperativity of folding/unfolding. The 66 C. thermocellum proteins were expressed as fusion proteins with MBP. Induction at 37 °C, 28 °C, and 18 °C produced 34, 57, and 60 soluble proteins, respectively. The higher solubility of C. thermocellum proteins in comparison with S. oneidensis proteins under similar induction conditions correlates with the thermophilicity of the source organism. The two-factor Wilkinson-Harrison statistical model was used to predict which proteins would be soluble or insoluble. Predictions and experimental data showed good agreement for S. oneidensis proteins; however, the model failed to distinguish soluble from insoluble C. thermocellum proteins.
It is suggested that the Wilkinson-Harrison model is not applicable to C. thermocellum proteins because it does not account for the peculiarities of protein sequences from thermophiles.
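The counts reported above are easier to compare as fractions of each protein set. The following is a minimal sketch in Python that tabulates them. The counts and set sizes (79 and 66) are taken from the text, while the names `TOTALS`, `SOLUBLE`, `soluble_fraction`, and `summary` are illustrative, and the sketch assumes that every protein in a set was tested under each listed condition.

```python
# Soluble-protein counts as reported in the abstract.
# Assumption: each set's full size (79 or 66) is the denominator for
# every condition; the abstract does not state this explicitly.
TOTALS = {"S. oneidensis": 79, "C. thermocellum": 66}

# (organism, construct) -> {induction temperature in deg C: soluble count}
SOLUBLE = {
    ("S. oneidensis", "30-residue tag"): {37: 5, 18: 17},
    ("S. oneidensis", "MBP fusion"): {37: 4, 28: 34, 18: 38},
    ("C. thermocellum", "MBP fusion"): {37: 34, 28: 57, 18: 60},
}


def soluble_fraction(organism, construct, temp_c):
    """Return the fraction of the organism's protein set that was soluble."""
    return SOLUBLE[(organism, construct)][temp_c] / TOTALS[organism]


def summary():
    """Return (organism, construct, temp_c, count, total, fraction) rows."""
    rows = []
    for (organism, construct), by_temp in SOLUBLE.items():
        for temp_c in sorted(by_temp, reverse=True):
            rows.append((
                organism,
                construct,
                temp_c,
                by_temp[temp_c],
                TOTALS[organism],
                soluble_fraction(organism, construct, temp_c),
            ))
    return rows


if __name__ == "__main__":
    for organism, construct, temp_c, count, total, frac in summary():
        print(f"{organism:16s} {construct:15s} {temp_c:2d} C  "
              f"{count:2d}/{total}  {frac:.0%}")
```

Under the stated assumption, the fractions make the two trends in the text explicit: solubility rises as induction temperature falls, and at each temperature a larger share of the C. thermocellum MBP fusions is soluble than of the S. oneidensis MBP fusions.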