Inspiration

The rapid growth of genome sequencing has created a gap between available biological data and its practical use in bioprocess engineering. While Genome-Scale Metabolic Models (GEMs) are powerful tools for understanding and optimizing microbial systems, they remain largely inaccessible to non-specialists. We were inspired to bridge this gap by creating a tool that translates complex metabolic modeling into an intuitive workflow for optimizing fungal growth conditions. Additionally, the growing industrial relevance of fungi in food, materials, waste processing, and pharmaceuticals motivated us to focus on enabling more efficient and scalable bioprocess design.

What it does

Predict is an end-to-end pipeline that takes a genome as input and outputs optimized growth media conditions for filamentous fungi. It automates:

  • Genome-to-GEM reconstruction
  • Simulation of metabolic fluxes under varying media conditions
  • Optimization of growth rate and biomass production
  • Reduction of byproducts through constraint-based modeling

The system reduces the experimental search space by identifying high-performing media compositions in silico, enabling faster and cheaper bioprocess development.

How we built it

We designed Predict as a modular computational pipeline:

  1. Model Generation: Using automated reconstruction tools (e.g., CarveMe), we generate a GEM from genomic data.
  2. Preprocessing: The model is cleaned, constraints are standardized, and exchange reactions are configured to simulate media conditions.
  3. Simulation Engine: We use COBRApy for Flux Balance Analysis (FBA) and Flux Variability Analysis (FVA).
  4. Sampling: A polytope sampler explores the feasible metabolic space to understand variability and robustness.
  5. Optimization: Different media compositions are systematically tested to maximize biomass while minimizing waste.
  6. Output Layer: Results are structured for visualization and eventual UI integration.

Each component is designed to be interoperable and extensible, enabling future integration into a user-facing interface.
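As a rough illustration of steps 2 and 5 above, the sketch below turns a medium into exchange-reaction bounds and ranks a few candidate media by predicted growth. It is a toy stand-in, not the pipeline's code: the one-reaction network, the compound names, the `EX_` naming, and every number are hypothetical. For a network this small, the FBA linear program reduces to the most limiting nutrient, so no solver is needed. A real run would pass the bounds to a GEM through COBRApy instead.

```python
# Toy illustration only: the network, compound names, and numbers are hypothetical.

# Hypothetical biomass recipe: mmol of each nutrient consumed per gram of biomass.
BIOMASS_REQUIREMENT = {"glucose": 10.0, "ammonium": 2.0, "phosphate": 0.5}


def media_to_exchange_bounds(media, default_upper=1000.0):
    """Map a medium {compound: max uptake rate} to exchange-reaction bounds.

    Uses the common constraint-based convention that uptake is a negative
    exchange flux, so a compound's maximum uptake becomes its lower bound.
    Compounds absent from the medium cannot be taken up (lower bound 0).
    """
    bounds = {}
    for compound in BIOMASS_REQUIREMENT:
        max_uptake = media.get(compound, 0.0)
        if max_uptake < 0:
            raise ValueError(f"uptake for {compound} must be non-negative")
        lower = -max_uptake if max_uptake > 0 else 0.0
        bounds[f"EX_{compound}"] = (lower, default_upper)
    return bounds


def toy_max_growth(bounds):
    """Maximum biomass flux for the one-reaction toy network.

    With a single biomass reaction draining every nutrient in fixed ratios,
    the optimum is set by whichever nutrient runs out first.
    """
    return min(
        abs(bounds[f"EX_{compound}"][0]) / need
        for compound, need in BIOMASS_REQUIREMENT.items()
    )


def rank_media(candidates):
    """Return (name, growth) pairs sorted from best to worst predicted growth."""
    scored = [
        (name, toy_max_growth(media_to_exchange_bounds(media)))
        for name, media in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate media (maximum uptake rates per compound).
candidates = {
    "low_nitrogen": {"glucose": 20.0, "ammonium": 1.0, "phosphate": 1.0},
    "balanced": {"glucose": 20.0, "ammonium": 5.0, "phosphate": 1.0},
    "no_phosphate": {"glucose": 20.0, "ammonium": 5.0},
}
ranking = rank_media(candidates)
```

Here `ranking` lists the "balanced" medium first and the phosphate-free medium last with zero growth, which is the kind of in silico triage the optimization step performs at much larger scale.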

Challenges we ran into

  • Model Quality: Automatically generated GEMs often contain gaps or inconsistencies, requiring preprocessing.
  • Media Representation: Translating real-world growth conditions into model constraints is non-trivial, especially for fungi.
  • Objective Function Issues: Ensuring the correct biomass reaction and avoiding invalid objectives required debugging.
  • High-Dimensional Search Space: Efficient exploration without combinatorial explosion was challenging.
  • Tooling Limitations: Many tools are optimized for bacteria, not fungi.
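To make the search-space challenge above concrete, the following sketch counts the media in a full-factorial grid and draws a random subset instead of enumerating it. Random sampling is used here only as a simple baseline, not as the method Predict uses; the component names, concentration levels, and sample size are all hypothetical.

```python
import math
import random


def grid_size(levels_per_component):
    """Number of distinct media in a full-factorial grid."""
    return math.prod(len(levels) for levels in levels_per_component.values())


def sample_media(levels_per_component, n_samples, seed=0):
    """Draw random media from the grid instead of enumerating every combination."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [
        {name: rng.choice(levels) for name, levels in levels_per_component.items()}
        for _ in range(n_samples)
    ]


# Hypothetical design: 8 media components, 5 concentration levels each.
levels = {f"component_{i}": [0.0, 1.0, 2.0, 5.0, 10.0] for i in range(8)}

total = grid_size(levels)                      # 5**8 = 390625 combinations
subset = sample_media(levels, n_samples=200)   # evaluate only 200 of them
```

Even this small design yields hundreds of thousands of combinations, so each added component multiplies the number of simulations needed. A fixed sampling budget keeps the cost bounded, at the price of possibly missing the best medium.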

Accomplishments that we're proud of

  • Built a working pipeline from genome input to media optimization output
  • Integrated multiple tools (CarveMe, COBRApy, sampling algorithms)
  • Developed a robust method for applying media constraints across models
  • Demonstrated how GEM-based in silico screening narrows the set of media compositions that must be tested experimentally
  • Positioned the system for real-world industrial applications

What we learned

  • GEMs are powerful but need careful handling before non-specialists can use them
  • Growth conditions are as important as the metabolic network itself
  • Constraint-based modeling is highly sensitive to parameter choices
  • Bridging biology and software engineering requires careful abstraction
  • Iterative debugging is essential when working with biological models

What's next for Predict - Pacifico Biolabs Challenge

  • User Interface: Build an intuitive UI for non-specialists
  • Fungal-Specific Improvements: Use curated fungal GEMs and improved gapfilling
  • Optimization Algorithms: Introduce smarter search methods (e.g., Bayesian optimization)
  • Validation: Compare predictions with experimental data
  • Scalability: Enable batch processing and cloud deployment

Built With

  • cobrapy
  • dingo
  • fastapi
  • modelseedpy
  • polyround
  • python
  • sklearn
  • streamlit