The heterologous overexpression of integral membrane proteins in Escherichia coli often yields insufficient quantities of purifiable protein for applications of interest. The current study leverages a recently demonstrated link between co-translational membrane integration efficiency and protein expression levels to predict protein sequence modifications that improve expression. Membrane integration efficiencies, obtained using a coarse-grained simulation approach, robustly predicted effects on expression of the integral membrane protein TatC for a set of 140 sequence modifications, including loop-swap chimeras and single-residue mutations distributed throughout the protein sequence. Mutations that improve simulated integration efficiency were 4-fold enriched for improved experimentally observed expression levels. Furthermore, the effects of double mutations on both simulated integration efficiency and experimentally observed expression levels were cumulative and largely independent, suggesting that multiple mutations can be introduced to yield higher levels of purifiable protein. This work provides a foundation for a general method for the rational overexpression of integral membrane proteins based on computationally simulated membrane integration efficiencies.
Integral membrane proteins (IMPs) play crucial roles in the transport of molecules, energy, and information across the membrane and are an important focus of structural and biophysical studies. However, the production of sufficient levels of IMPs is a limiting factor in their characterization (1). Even among homologous IMP sequences, expression levels can vary widely (1-6), and the mechanistic basis for this variability is often unclear. Extensive efforts have been devoted to identifying IMP sequences, expression conditions, and host modifications that yield IMP expression at sufficient levels for further study (7-10). Despite these efforts, general guidelines for successful overexpression of IMPs are lacking.
Biogenesis of IMPs in Escherichia coli involves multiple steps that are potential bottlenecks for overexpression, including correct targeting to the inner membrane (11, 12), membrane integration (2, 13-17), and folding (18-21). For a given sequence, understanding how each of these steps affects observed expression levels may lead to improved strategies for IMP overexpression.
Previous work indicates that the Sec-facilitated membrane integration step of biogenesis is a limiting factor in the overexpression of the TatC IMP (2). Sequence changes in the C-tail that alter membrane integration efficiency, determined either from coarse-grained (CG) simulations or experimentally, were shown to correlate with experimentally observed IMP expression levels. Further work is necessary to explore the generality of this link and its potential for enabling the rational enhancement of IMP expression.
The current study demonstrates the predictive capacity of simulated integration efficiency for experimental expression by examining a wide...