Genomic Science Program
U.S. Department of Energy | Office of Science | Biological and Environmental Research Program

Novel Systems Approach for Rational Engineering of Robust Microbial Metabolic Pathways


Laura Jarboe1,*, Yannick Bomble2, Robert L. Jernigan1, Peter St. John2, Kejue Jia1, Pranav M. Khade1, Ambuj Kumar1, Onyeka Onyenemezu1, Jetendra K. Roy1, and Chao Wu2


1Iowa State University; and 2National Renewable Energy Laboratory


The goal of this project is to develop and implement a process for improving bioproduction under conditions that are appealing for industrial processes, such as high temperature and low pH. This approach addresses the failure of metabolic reactions due to inhibition, denaturation, misfolding, or disorder of enzymes. Researchers will develop and implement a framework that identifies these failing enzymes and then identifies robust replacement enzymes. The engineering strategy of replacing enzymes to improve bioproduction is well established but rarely applied to system-wide stressors. Researchers apply a systems genomics approach to improve bioproduction, with Escherichia coli as the model organism. Butanol production at high temperature and succinate production at low pH are the model systems. This approach is complementary to improving microbial robustness by engineering the cell membrane, and it has an advantage over evolution-based organism improvement in that it prioritizes bioproduction rather than growth.


Temperature Tolerance: Given the increase in thermostability data available from recent proteomics studies, researchers are using metabolic models to take a closer look at which enzymes limit growth at higher temperatures. While the melting temperature of an enzyme is useful, knowledge of the range of temperatures over which a given enzyme remains active can provide a more complete picture of an organism’s sensitivity to temperature. Researchers are using embeddings from protein language models such as ESM2 to predict enzyme temperature optima, as well as melting temperatures when measurements are not available, and have found improvements over previous methods (e.g., ProTstab2, DeepET). Researchers are also exploring ways to improve the predictive models, such as integrating protein structures with embeddings from the language models. Growth-limiting enzymes in response to increasing temperature in E. coli have previously been identified through the use of genome-scale models (Chang et al. Science 2013). That analysis relied on estimated values for many of the key enzyme-specific temperature parameters. In the years since the original study, there have been multiple proteomic studies of enzyme melting temperature (Tm). Researchers have used these data, along with literature reports of enzyme assays and the ProTstab2 Tm predictor (Yang et al. BMC Genomics, 2019), to compile an updated set of E. coli Tm values for assessing enzyme thermosensitivity and prioritizing enzymes for experimental assessment.
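As a rough illustration of how a compiled Tm set might feed into prioritization, the sketch below merges measured and predicted values and ranks enzymes by how far their Tm lies above a target growth temperature. The function names, the rule of preferring a measured value over a predicted one, the enzyme labels, and all numbers are assumptions made for illustration; none of them are project data.

```python
def compile_tm(measured, predicted):
    """Merge Tm values, preferring a measured value over a predicted one.

    The measured-over-predicted priority is an assumption of this sketch.
    """
    compiled = {}
    for enzyme in sorted(set(measured) | set(predicted)):
        if enzyme in measured:
            compiled[enzyme] = (measured[enzyme], "measured")
        else:
            compiled[enzyme] = (predicted[enzyme], "predicted")
    return compiled


def rank_by_margin(compiled, growth_temp_c):
    """Return (enzyme, Tm - growth temperature, source), smallest margin first."""
    rows = [(name, tm - growth_temp_c, source)
            for name, (tm, source) in compiled.items()]
    return sorted(rows, key=lambda row: row[1])


# Placeholder Tm values in degrees Celsius (hypothetical, not measurements).
measured = {"enzyme_A": 52.0, "enzyme_B": 61.5}
predicted = {"enzyme_B": 58.0, "enzyme_C": 47.0}

ranking = rank_by_margin(compile_tm(measured, predicted), growth_temp_c=45.0)
for name, margin, source in ranking:
    print(name, margin, source)
```

Enzymes at the top of such a ranking have the smallest stability margin and would be natural candidates for experimental follow-up; keeping the source label makes it easy to see which candidates rest on a prediction rather than a measurement.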

Acid Tolerance: The pH tolerance efforts have prioritized modeling the effect of pH on the allocation of cellular resources. In the current version of the model, researchers assume that the cell interior is maintained at near-neutral pH through an energetic investment of ATP in proton extrusion. Researchers have completed a two-stage optimization model. In the first stage, cell growth is maximized subject to thermodynamic constraints and a limit on the total enzyme protein available for allocation. In the second stage, the flux distributions associated with this maximum growth are used in a minimization of overall enzyme protein cost. This model predicts a sharp decrease in specific growth rate when the extracellular pH drops below 4.9. Next, researchers investigated the thermodynamic and kinetic properties of individual enzymes in the model. Generally, low pH leads to additional enzyme protein cost to maintain thermodynamic feasibility and reaction rate, represented by the thermodynamic term and kinetic term, respectively (Figure 2). Interestingly, the glucose transporter and glycolysis enzymes account for the majority of the enzymes with the highest protein requirement, especially at low pH.
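The two-stage structure described above can be sketched on a toy problem. This minimal example assumes two hypothetical parallel pathways with equal growth yield but different enzyme cost per unit flux. It omits the thermodynamic constraints of the actual model and swaps a proper optimization solver for an exhaustive search over an integer flux grid. All names and parameter values are illustrative.

```python
from itertools import product


def two_stage_allocation(uptake_max=10, enzyme_budget=40,
                         yields=(1, 1), costs=(2, 3)):
    """Toy two-stage optimization over two hypothetical pathways.

    Stage 1 maximizes growth; stage 2 minimizes total enzyme cost
    among the flux distributions that reach that maximum growth.
    """
    def growth(v):
        return yields[0] * v[0] + yields[1] * v[1]

    def enzyme_cost(v):
        return costs[0] * v[0] + costs[1] * v[1]

    # Feasible integer flux pairs under the uptake limit and enzyme budget.
    grid = range(uptake_max + 1)
    feasible = [v for v in product(grid, grid)
                if v[0] + v[1] <= uptake_max
                and enzyme_cost(v) <= enzyme_budget]

    # Stage 1: maximum achievable growth.
    mu_max = max(growth(v) for v in feasible)

    # Stage 2: cheapest flux distribution that still reaches mu_max.
    optimal = [v for v in feasible if growth(v) == mu_max]
    best = min(optimal, key=enzyme_cost)
    return mu_max, best, enzyme_cost(best)


mu_max, fluxes, cost = two_stage_allocation()
print(mu_max, fluxes, cost)  # 10 (10, 0) 20
```

With the default values, many flux splits reach the same maximum growth, and the second stage selects the one that routes all flux through the cheaper pathway. Lowering `enzyme_budget` or raising `costs` is one crude way to mimic a higher enzyme protein requirement, and in this toy it reduces the achievable growth.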

Researchers have compiled E. coli enzyme pH sensitivity data from the literature and are in the process of assessing the accuracy of the patcHwork sequence/structure-based tool for estimating enzyme pH sensitivity. Among the enzymes with literature data available, phosphatidylserine synthase (PssA) is especially pH sensitive. Researchers tested the effect of increased expression of PssA on growth at a range of pH values. At values as low as pH 4.0, the increased expression of PssA was beneficial for growth. Efforts to increase the pH tolerance of the PssA enzyme are ongoing.

Modeling of enzyme stability: Investigation of existing methods for predicting protein stability found them to be unreliable. Researchers are beginning to apply protein language models such as ESM1 to the problem, but have recently found similar DNA codon language models (Outeiral and Deane 2022) that are expected to be more effective for this type of prediction. Researchers are developing new machine learning models that use this type of DNA language model to predict protein melting temperatures. (This relates to the importance of specific codons for the rate of translation, which affects co-translational folding.) Researchers also plan to extend this type of model to the prediction of optimal pH for E. coli enzymes, which is a relatively underdeveloped problem.

This builds upon the team’s own expertise in utilizing protein language models to make major improvements in protein sequence matching and function prediction (Kilinc, Jia, and Jernigan 2023).
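To make concrete why a codon-level model can carry information that a protein-level model cannot, the sketch below splits coding sequences into codon tokens and translates them with the standard genetic code. The two example sequences are hypothetical and differ only in synonymous codons. This illustrates the input representation only and says nothing about how any particular codon language model is built.

```python
BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_tokens(cds):
    """Split a coding sequence into codon tokens."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codon_tokens(cds))


# Two hypothetical coding sequences that differ only in synonymous codons.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCGAAG"
print(codon_tokens(seq_a), translate(seq_a))  # ['ATG', 'GCT', 'AAA'] MAK
print(codon_tokens(seq_b), translate(seq_b))  # ['ATG', 'GCG', 'AAG'] MAK
```

Both sequences encode the same peptide, so a model that sees only amino acids receives identical input for them, while a codon-level model can distinguish the two.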

Funding Information

This research was supported by the DOE Office of Science, Office of Biological and Environmental Research (BER), grant no. DE-SC0022090.

This work was authored in part by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding was provided by the U.S. Department of Energy, Office of Science, through the Genomic Science program, Office of Biological and Environmental Research, under FWP ERW3526. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government.