“…For instance, the probability of expressing a protein in soluble form was inversely correlated with protein size [28], suggesting that Seq_len is an essential feature. In addition, amino acid composition was found to be a critical factor in inducing metabolic stress during RPP in E. coli [29]; hence, the expression of recombinant proteins can be improved by adjusting the amino acid composition [30]. Specifically, the present study revealed that amino acid occurrence and occurrence-frequency features such as Occ_E, Occ_V, OF_E, OF_S, OF_F, OF_M, MNC_A, OF_MNC_A, and OF_MNC_C are significant factors for soluble protein expression in the periplasm of E. coli.…”
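
As a rough illustration of how sequence-derived features of this kind might be computed, the sketch below derives Seq_len, Occ_X, and OF_X from a one-letter amino acid sequence. It rests on assumptions not confirmed by the excerpt: Occ_X is taken to be the raw count of residue X, OF_X is taken to be that count divided by the sequence length (a fraction, not a percentage), and the function name `composition_features` is hypothetical. The MNC_* features are left out because the excerpt does not define them.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence: str) -> dict:
    """Compute Seq_len, Occ_X and OF_X for a protein sequence.

    Assumed definitions (not taken from the source):
      Occ_X = number of times residue X occurs in the sequence
      OF_X  = Occ_X / Seq_len
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must be non-empty")

    counts = Counter(seq)
    features = {"Seq_len": len(seq)}
    for aa in STANDARD_AMINO_ACIDS:
        occurrences = counts.get(aa, 0)
        features[f"Occ_{aa}"] = occurrences
        features[f"OF_{aa}"] = occurrences / len(seq)
    return features


if __name__ == "__main__":
    # Toy peptide chosen only for illustration, not a real protein.
    feats = composition_features("MEEVS")
    for name in ("Seq_len", "Occ_E", "Occ_V", "OF_E", "OF_S"):
        print(name, feats[name])
```

Non-standard residue codes (for example X or B) count toward Seq_len here but get no Occ_ or OF_ entry of their own; whether the original study treated them that way is unknown.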