Engineering the strains, rapidly.
Problem description
Metabolic engineering enables efficient development of microbial strains for the production of biochemicals and bio-based materials by re-routing cellular fluxes. The increasing availability of biological data, together with high-throughput strain construction and screening workflows, provides unique opportunities to speed up the metabolic engineering process using machine learning approaches.
Multiple machine learning methods have recently been developed to support several of the major steps in metabolic engineering. Three technological research trends were identified for this problem, and the key challenge is defined as: “Can we combine these technologies with advanced modeling approaches to strengthen metabolic engineering workflows?” The three trends are:
- Metabolic flux optimization: machine learning methods for efficient exploration of combinatorial pathway optimization and bioprocess optimization, as well as for the selection or design of genetic parts (e.g., gene variants) and genome-wide host optimization.
- Optimization of genome engineering tools: how can machine learning methods help improve genome engineering tools, e.g., by increasing editing efficiencies or reducing off-target effects?
- Development of ultra-high-throughput (ultra-HT) screening approaches: machine learning-based interpretation of ultra-high-throughput screening data (e.g., FACS, imaging) to improve selection and/or strain diagnostics.
References
- Systems metabolic engineering strategies: integrating systems and synthetic biology with metabolic engineering
- Machine learning applications in systems metabolic engineering
- WEBINAR: Engineering Cellular Metabolism using Machine Learning
Leveraging advanced mechanistic modelling and the generation of high-quality multi-dimensional data sets, machine learning is becoming an integral part of understanding and engineering living systems.
- WEBINAR: Helix engineering: combining the power of 3DM with AI to disrupt protein engineering
The high dimensionality and practically infinite size of the sequence space require effective techniques to explore, navigate and improve proteins. Current techniques are underwhelming in their accuracy and ability to find novel variants. With its 3DM technology, Bio-Prodict has long been at the forefront of providing protein engineering solutions.
- WEBINAR: End-to-end experimental and machine learning workflows for predictive genetic design
High-throughput experiments combined with emerging machine learning (ML) technologies are enabling data-centric biological design workflows for synthetic biology.
- Simulated Design−Build−Test−Learn Cycles for Consistent Comparison of Machine Learning Methods in Metabolic Engineering
Combinatorial pathway optimization is an important tool in metabolic flux optimization. Simultaneous optimization of a large number of pathway genes often leads to combinatorial explosions. Strain optimization is therefore often performed using iterative design–build–test–learn (DBTL) cycles. The aim of these cycles is to develop a product strain iteratively, every time incorporating learning from the previous cycle. Machine learning methods provide a potentially powerful tool to learn from data and propose new designs for the next DBTL cycle. However, due to the lack of a framework for consistently testing the performance of machine learning methods over multiple DBTL cycles, evaluating the effectiveness of these methods remains a challenge. In this work, we propose a mechanistic kinetic model-based framework to test and optimize machine learning for iterative combinatorial pathway optimization. Using this framework, we show that gradient boosting and random forest models outperform the other tested methods in the low-data regime. We demonstrate that these methods are robust for training set biases and experimental noise. Finally, we introduce an algorithm for recommending new designs using machine learning model predictions. We show that when the number of strains to be built is limited, starting with a large initial DBTL cycle is favorable over building the same number of strains for every cycle.
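The iterative loop described in the abstract above can be illustrated with a small simulation. This is only a minimal sketch under loudly stated assumptions, not the framework from the paper: `toy_titer` is a hypothetical response surface standing in for the mechanistic kinetic model, a simple k-nearest-neighbour regressor stands in for the gradient boosting and random forest models, and `recommend` is an illustrative selection rule rather than the published recommendation algorithm. The pathway size, expression levels, noise level, and all function names are made up for illustration.

```python
import itertools
import random

N_GENES = 4          # hypothetical pathway size
LEVELS = range(5)    # hypothetical expression levels per gene (e.g. promoter strengths)
DESIGN_SPACE = list(itertools.product(LEVELS, repeat=N_GENES))  # 5**4 = 625 designs
OPTIMUM = (3, 1, 4, 2)  # arbitrary optimum of the toy landscape (assumption)


def toy_titer(design, rng, noise_sd=0.05):
    """Hypothetical stand-in for a kinetic model: peaks at OPTIMUM, plus noise."""
    dist2 = sum((d - o) ** 2 for d, o in zip(design, OPTIMUM))
    return 1.0 / (1.0 + 0.3 * dist2) + rng.gauss(0.0, noise_sd)


def knn_predict(design, built, k=3):
    """Predict titer as the mean over the k nearest already-built designs."""
    nearest = sorted(
        built.items(),
        key=lambda item: sum((a - b) ** 2 for a, b in zip(design, item[0])),
    )[:k]
    return sum(titer for _, titer in nearest) / len(nearest)


def recommend(built, n, rng, explore=0.2):
    """Pick n unbuilt designs: mostly top predictions, plus some random exploration."""
    candidates = [d for d in DESIGN_SPACE if d not in built]
    n = min(n, len(candidates))
    n_random = int(round(explore * n))
    ranked = sorted(candidates, key=lambda d: knn_predict(d, built), reverse=True)
    chosen = ranked[: n - n_random]
    chosen_set = set(chosen)
    rest = [d for d in candidates if d not in chosen_set]
    return chosen + rng.sample(rest, n_random)


def run_dbtl(cycle_sizes, seed=0):
    """Simulate DBTL cycles; returns a dict mapping built designs to measured titers."""
    rng = random.Random(seed)
    built = {}
    for i, size in enumerate(cycle_sizes):
        if i == 0:
            designs = rng.sample(DESIGN_SPACE, size)  # Design: first cycle is random
        else:
            designs = recommend(built, size, rng)     # Learn, then Design
        for d in designs:
            built[d] = toy_titer(d, rng)              # Build and Test (simulated)
    return built


# Same total budget of 60 strains, split evenly or front-loaded.
equal = run_dbtl([20, 20, 20])
front_loaded = run_dbtl([40, 10, 10])
print("design space size:", len(DESIGN_SPACE))
print("best titer, equal cycles:", max(equal.values()))
print("best titer, front-loaded:", max(front_loaded.values()))
```

Because the landscape and the surrogate are toys, the comparison between the two budget splits printed here says nothing about real pathways; the sketch only shows how cycle sizes, a learned model, and a recommendation step fit together, and why a simulated ground truth makes such comparisons repeatable.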
Goal
To determine how AI/ML methods can support efficient exploration of a large solution space for (iterative) strain improvement. Further, to demonstrate the added value of such methods in test cases that span multiple iterative rounds of strain improvement, for a relevant application: the production of a food, feed, or nutritional ingredient at industrial scale.