🧬 Nightingale Labs — Evidence-First In-Silico Drug Experiment Simulator

💡 Inspiration

  • A typical drug takes about 10 years to get from bench to bedside and costs roughly $1B to develop.
  • Wet-lab loops are slow; literature is vast.
  • We wanted a way to ask a precise biological question and get evidence-backed, uncertainty-aware projections—without heavy bespoke ML models or physics simulators.
  • The idea: use a strong search + reasoning API to mine the literature, then simulate likely experimental outcomes directly from that evidence. Perplexity’s API gave us a fast lane from papers → structured claims → plausible experiment readouts.

⚙️ What It Does

  • Nightingale Labs lets researchers pose a question, e.g.: “In IFN-low neuroendocrine cells, will a CK2 inhibitor increase MHC-I?”

The platform:

  • Searches & synthesizes evidence using the Perplexity API (papers, trials, reviews).
  • Extracts effect signals — e.g., “↑ MHC-I ~10–30% in related contexts,” “mild apoptosis at ≥1 µM.”
  • Runs a lightweight experiment simulator that converts those signals into projected curves (time-series, dose–response bands) with credible intervals.
  • Returns plots + a narrative that cite sources and explain assumptions.
  • Incorporates user feedback to refine subsequent projections.
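A structured question of this kind might be encoded roughly as follows (all field names and values are illustrative, not the platform's actual intake schema):

```json
{
  "context":      {"cell_type": "neuroendocrine", "ifn_status": "low"},
  "perturbation": {"compound": "CK2 inhibitor", "doses_nM": [10, 100, 1000]},
  "readouts":     ["MHC-I surface expression", "viability"],
  "horizon_hours": 72
}
```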

🧩 How We Built It

🔭 System at a Glance

  • Orchestrator (FastAPI + worker): receives the question, calls Perplexity, runs the simulator, assembles a report.
  • Evidence Engine (Perplexity API):
      • Uses structured prompts to extract: population/context, compound/dose, assay/readouts, directionality, magnitude, timing, confidence, citations.
      • Normalizes effect sizes (e.g., % change, log₂ FC, odds ratios) to a common internal schema.
  • Drug Experiment Simulator (rules + uncertainty): converts evidence into prior distributions and Monte Carlo projections. Includes:
      • Dose–response (Emax) linkage
      • Time-to-effect priors (rise/decay) from reported kinetics
      • Median and 95% CrI outputs for each readout over time and dose
  • Reporter:
      • Plots trajectories (MHC-I, viability/apoptosis, pathway proxies)
      • Lists assumptions
      • Inlines citations
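The effect-size normalization step can be sketched roughly as below. The `Claim` schema, function names, and the odds-ratio approximation are illustrative assumptions, not the project's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    """A single extracted effect claim (illustrative schema)."""
    readout: str          # e.g. "MHC-I"
    effect_type: str      # "pct_change" | "log2_fc" | "odds_ratio"
    value: float
    citation: str         # citation metadata, e.g. a DOI

def to_pct_change(claim: Claim) -> float:
    """Normalize a reported effect size to percent change vs. control."""
    if claim.effect_type == "pct_change":
        return claim.value
    if claim.effect_type == "log2_fc":
        # A log2 fold change of 1.0 means 2x expression, i.e. +100%.
        return (2 ** claim.value - 1.0) * 100.0
    if claim.effect_type == "odds_ratio":
        # Crude stand-in: treat the odds ratio as a fold change.
        return (claim.value - 1.0) * 100.0
    raise ValueError(f"unknown effect type: {claim.effect_type}")
```

Mapping everything onto one scale like this is what lets heterogeneous papers feed a single set of priors downstream.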

📦 Data Contracts

  • Question JSON: cell context, perturbation, doses, readouts
  • Evidence JSON: claims with effect size, units, uncertainty, and citation metadata
  • Simulation Spec: priors over (Emax, EC50, h); kinetics (tau_rise, tau_decay); noise model
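A minimal Simulation Spec could look like the following YAML. All field names and numbers here are invented for illustration, not the project's actual schema:

```yaml
# Illustrative Simulation Spec (example values, not evidence-derived)
readout: MHC-I
priors:
  Emax: {dist: normal,    mean: 0.20, sd: 0.08}    # max fractional increase
  EC50: {dist: lognormal, mu: 4.6,    sigma: 0.5}  # nM scale
  h:    {dist: normal,    mean: 1.0,  sd: 0.3}     # Hill coefficient
kinetics:
  tau_rise:  {dist: lognormal, mu: 2.0, sigma: 0.4}  # hours
  tau_decay: {dist: lognormal, mu: 3.5, sigma: 0.4}
noise:
  model: lognormal
  cv: 0.15
```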

🧪 Example Flow (Silmitasertib-Style Question)

  • Perplexity returns snippets indicating CK2 inhibition → ↑ antigen presentation in IFN-modulated contexts; mild cytotoxicity at higher doses.
  • Engine sets priors (e.g., Emax for MHC-I, EC50, h) with an IFN-context modifier.
  • Simulator samples 10k trajectories per dose (e.g., 10 / 100 / 1000 nM) → produces bands for MHC-I and viability.
  • Report explains why (citations), how sure (CrIs), and what to test next.
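The projection step above is conceptually a vectorized Monte Carlo over the sampled priors. A minimal sketch, with all parameter values invented for illustration rather than taken from real evidence:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000                                    # trajectories per dose
doses_nM = np.array([10.0, 100.0, 1000.0])

# Sample priors (illustrative values only)
emax = rng.normal(0.20, 0.05, N)              # max fractional MHC-I increase
ec50 = rng.lognormal(np.log(100.0), 0.5, N)   # nM
h    = rng.normal(1.0, 0.2, N)                # Hill coefficient

def emax_response(dose, emax, ec50, h):
    """Hill/Emax dose-response: fractional effect at a given dose."""
    return emax * dose**h / (ec50**h + dose**h)

for dose in doses_nM:
    eff = emax_response(dose, emax, ec50, h)
    lo, med, hi = np.percentile(eff, [2.5, 50, 97.5])
    print(f"{dose:7.0f} nM: median {med:+.1%}, 95% CrI [{lo:+.1%}, {hi:+.1%}]")
```

The percentile bands over the sampled trajectories are what the reporter draws as credible-interval ribbons.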

🚧 Challenges We Ran Into

  • We planned to integrate Google's Cell2Sentence (C2S) framework (https://www.biorxiv.org/content/10.1101/2025.04.14.648850v2) to process images and use a medicine-specific agent, but struggled with the Google Cloud agent integration.
  • We also planned to integrate the PhysiCell and PhysiPKPD experimental simulators for a higher-quality assessment of likely outcomes for candidate drugs.

  • Heterogeneous reporting: papers mix endpoints and units, so we built robust unit harmonization and effect-size normalization.

  • Context transfer: mapping literature contexts to user setup (cell type, IFN baseline) without over-claiming; used explicit context similarity scoring and down-weighted mismatches.

  • Uncertainty plumbing: keeping priors honest when evidence is sparse or contradictory; simulator widens CrIs and flags low-confidence assertions.
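The context down-weighting and prior widening mentioned above can be sketched as a single inflation rule. The specific formula and constants here are illustrative assumptions, not the project's actual weighting scheme:

```python
def widen_prior_sd(base_sd: float, context_similarity: float) -> float:
    """Inflate a prior's standard deviation when the literature context
    poorly matches the user's setup (similarity in [0, 1])."""
    assert 0.0 <= context_similarity <= 1.0
    # Perfect match keeps base_sd; zero similarity triples it (example rule).
    return base_sd * (1.0 + 2.0 * (1.0 - context_similarity))
```

The effect is that mismatched or sparse evidence produces visibly wider credible intervals rather than confident-looking projections.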

🏆 Accomplishments We're Proud Of

  • Biomedical PhD students who tried the tool report that it is genuinely useful and has surfaced interesting research directions to pursue.
  • An example of AI not just summarising the literature but generating new ideas for medical research.
  • Full paper → projection loop with source-linked assumptions — zero custom ML training.
  • Clean schemas and a YAML Simulation Spec so scientists can audit priors and knobs.
  • Minutes-to-insight projections (dose–response curves + narratives) that speed up wet-lab planning.

🧠 What We Learned

  • Evidence structure > model complexity: reliable extraction + principled priors beat black-box predictions.
  • Show the bands: credible intervals shift the conversation from “Is it true?” to “What should we test to collapse uncertainty?”
  • Context is a coefficient: explicit similarity weights prevent overgeneralizing literature to mismatched systems.

🚀 What’s Next for Nightingale Labs

  • Better extraction: table/figure parsers; auto-unit detection; contradiction detection.
  • Richer priors: hierarchical meta-analysis across cell types/assays; co-perturbation handling.
  • Design of Experiments (DoE): suggest the next best experiment to reduce uncertainty.
  • Interactive UI: sliders for priors, instant re-projection, one-click CSV/PDF export.
  • Validation loop: compare projections with new wet-lab results; continuously update priors.

🧾 Project Overview

  • Nightingale Labs is an AI-assisted literature-to-simulation tool: you ask a question → it synthesizes evidence via Perplexity → produces plausible experimental readouts (with uncertainty) you can use to plan assays.

🧰 Tech Stack

  • Backend/Orchestrator: Python (FastAPI), task queue
  • Evidence: Perplexity API (search + synthesis), JSON citation graph
  • Data: JSON/YAML specs; CSV outputs; optional DVC for artifacts
  • Frontend: Streamlit dashboard
  • Cloud: Containers on Cloud Run (or VM); Secret Manager; GCS for results

🔑 Core Features

  • Structured question intake (context, doses, readouts)
  • Evidence synthesis (citations, effect sizes, confidence)
  • Evidence-driven projections (dose–time curves + 95% CrIs)
  • Narrative + assumptions (transparent priors, context similarity, limitations)
  • Downloadables (CSV/PDF) and reproducible YAML spec

🧱 Development Status

  • ✅ Done: schemas; Perplexity prompts; normalization; simulator core; reporting
  • ⚙️ In progress: contradiction detection; auto-unit conversion; CLI demo
  • 🔜 Next: DoE recommender; UI; validation on benchmark interventions

🔒 Guidelines, Security, Performance & Validation

  • Guidelines: PEP8; config-over-code; explicit assumptions in YAML
  • Security: HTTPS; secret isolation; least-privilege IAM; input sanitization
  • Performance: aggressive evidence caching; vectorized Monte Carlo; slim Docker images
  • Validation: unit tests for extraction/normalization; in silico ablations; prospective checks against new wet-lab data

🧭 TL;DR

Nightingale Labs converts search-derived evidence into uncertainty-aware experimental projections, helping scientists pick doses, timepoints, and readouts that matter — no heavyweight models required.
