Inspiration

Families can download raw DNA files from 23andMe or Ancestry, but then what? Most dashboards either bury people in jargon or stop at vague trait lists. We wanted to turn dense scientific and medical data into clear, actionable insights a family can use together.

The technical spark came from ADAGIO, a disease–gene prioritization approach based on network biology. In many diseases, the genes involved work together in pathways, so ADAGIO runs a random-walk propagation over a protein–protein interaction network, starting from known disease genes, and produces a relevance score for every gene. We asked: what if we precompute those scores for common diseases and make them usable in a friendly web app?
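
To make the propagation idea concrete, here is a minimal sketch of a random walk with restart on a toy network. It illustrates the general technique only and is not ADAGIO's actual implementation: the function name, the restart probability of 0.3, and the five-node network with placeholder node names are all assumptions made for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-9, max_iter=1000):
    """Score every node by its steady-state visit probability.

    adjacency: dict mapping node -> set of neighbours (undirected network).
    seeds: known disease genes; the walker jumps back to them with
           probability `restart` at every step.
    """
    seeds = set(seeds)
    if not seeds or not seeds <= set(adjacency):
        raise ValueError("seeds must be a non-empty subset of the network")

    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            walk_mass = (1.0 - restart) * p[u]
            neighbours = adjacency[u]
            if neighbours:
                # Spread the walking mass evenly over the neighbours.
                share = walk_mass / len(neighbours)
                for v in neighbours:
                    nxt[v] += share
            else:
                # Isolated node: return its mass to the seeds so the total stays 1.
                for n in nodes:
                    nxt[n] += walk_mass * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network with placeholder names (not real genes or interactions).
toy_network = {
    "SEED": {"A", "B"},
    "A": {"SEED", "C"},
    "B": {"SEED"},
    "C": {"A", "D"},
    "D": {"C"},
}
scores = random_walk_with_restart(toy_network, seeds={"SEED"})
ranked_genes = sorted(scores, key=scores.get, reverse=True)
```

In this toy case the scores fall off with distance from the seed, which is the intuition behind using them as relevance scores: genes close to known disease genes in the network rank higher.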


What it does

GeneGuard lets a user upload their existing raw genomics file (23andMe/Ancestry TXT or VCF) and then:

  • Choose a disease (e.g., Alzheimer’s, type 2 diabetes) or auto-rank all diseases by aggregate risk.
  • Get a ranked list of relevant genes: we map the file’s rsIDs → genes, intersect them with our precomputed ADAGIO tables, and assign High/Medium/Low levels by rank band.
  • Read five concise, evidence-anchored lifestyle suggestions for each hit (worded in WHO/NIH/CDC style) to turn insights into action.
  • Export results as CSV and (optionally) share with family to compare overlaps.
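
The rsID → gene → ranked-hit flow above can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the helper names are hypothetical, the rsID → gene lookup is a plain dict standing in for the annotation step, and the rsIDs, gene names, and scores are placeholders. Dropping genes ranked below 500 is also an assumption made for the sketch.

```python
def parse_rsids(lines):
    """Collect rsIDs from a 23andMe/Ancestry-style text export.

    Skips '#' comment lines, column-header lines, and non-rs identifiers.
    """
    rsids = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        first = line.split()[0]
        if first.startswith("rs") and first[2:].isdigit():
            rsids.append(first)
    return rsids


def risk_level(rank):
    """Map a gene's rank to a level using rank bands."""
    if rank <= 100:
        return "High"
    if rank <= 300:
        return "Medium"
    if rank <= 500:
        return "Low"
    return None  # assumption: genes ranked below 500 are not reported


def rank_hits(rsids, rsid_to_gene, adagio_table):
    """Intersect the user's genes with one disease table of {gene: {risk, rank}}."""
    genes = {rsid_to_gene[r] for r in rsids if r in rsid_to_gene}
    hits = []
    for gene in genes:
        entry = adagio_table.get(gene)
        if entry is None:
            continue
        level = risk_level(entry["rank"])
        if level is not None:
            hits.append({"gene": gene, "rank": entry["rank"],
                         "risk": entry["risk"], "level": level})
    return sorted(hits, key=lambda hit: hit["rank"])


# Placeholder data: not real variants, genes, or scores.
SAMPLE = """# toy export in a 23andMe-like layout
# rsid\tchromosome\tposition\tgenotype
rs111\t1\t1000\tAG
i5000\t1\t2000\tCC
rs222\t2\t3000\tTT
rs333\t3\t4000\tGG
"""
rsid_to_gene = {"rs111": "GENE_A", "rs222": "GENE_B", "rs333": "GENE_C"}
adagio_table = {
    "GENE_A": {"risk": 0.50, "rank": 250},
    "GENE_B": {"risk": 0.75, "rank": 12},
    "GENE_C": {"risk": 0.10, "rank": 900},
}
hits = rank_hits(parse_rsids(SAMPLE.splitlines()), rsid_to_gene, adagio_table)
```

Here `hits` lists GENE_B (High) before GENE_A (Medium), and GENE_C is left out because its rank falls outside every band.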

Note: Research-grade, not diagnostic; we display a clear disclaimer encouraging follow-up with a genetic counselor.


How we built it

  • Frontend: React.js with a clean upload flow, a “How It Works” page, and a results table (expand/collapse per gene for tips), plus CSV export.
  • Backend: FastAPI (Python), deployed on Render.
    • Endpoints:
      • GET /diseases — available diseases
      • POST /upload-genome — parse TXT/VCF, map rsIDs → genes via MyVariant.info, intersect with ADAGIO tables, return ranked genes + tips
      • POST /auto-rank — score all diseases and return top-N by aggregate ADAGIO score
      • GET /results/{id}/csv — CSV export of a prior result
    • Parsing & annotation: TXT (rsIDs) and VCF (streamed via cyvcf2); both resolve to gene symbols.
    • ADAGIO integration: precomputed JSON per disease (adagio_{disease}.json) with {gene: {risk, rank}}.
    • Risk levels: rank bands (Top 100 = High, 101–300 = Medium, 301–500 = Low).
    • Performance: in-memory ADAGIO cache, parallel disease scoring, tip generation only for displayed results, and graceful API timeouts.
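
As a rough illustration of the in-memory cache and parallel disease scoring, the sketch below loads each adagio_{disease}.json once and scores diseases in a thread pool. The function names, the worker count, and the choice of "sum of risk values over the user's genes" as the aggregate score are assumptions for the example; the toy tables are written to a temporary directory with placeholder disease and gene names.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_table(path):
    """Read one ADAGIO table from disk; repeat calls are served from memory."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def aggregate_score(user_genes, table):
    """Assumed aggregate: sum of risk values over the user's genes in the table."""
    return sum(table[gene]["risk"] for gene in user_genes if gene in table)


def auto_rank(user_genes, table_dir, diseases, top_n=3):
    """Score every disease in parallel and return the top-N (disease, score) pairs."""
    genes = frozenset(user_genes)

    def score(disease):
        path = str(Path(table_dir) / f"adagio_{disease}.json")
        return disease, aggregate_score(genes, load_table(path))

    with ThreadPoolExecutor(max_workers=8) as pool:
        scored = list(pool.map(score, diseases))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy tables with placeholder names and scores, written to a temp directory.
toy_tables = {
    "disease_x": {"GENE_A": {"risk": 0.5, "rank": 1},
                  "GENE_B": {"risk": 0.25, "rank": 2}},
    "disease_y": {"GENE_B": {"risk": 0.125, "rank": 1}},
}
with tempfile.TemporaryDirectory() as tmp:
    for name, table in toy_tables.items():
        path = Path(tmp) / f"adagio_{name}.json"
        path.write_text(json.dumps(table), encoding="utf-8")
    ranking = auto_rank({"GENE_A", "GENE_B"}, tmp, list(toy_tables))
    ranking_again = auto_rank({"GENE_A", "GENE_B"}, tmp, list(toy_tables))
    cache_hits = load_table.cache_info().hits  # second call never touches disk
```

Threads fit here because the per-disease work is mostly file reads; for CPU-heavy scoring, a process pool may be the better choice.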

Challenges we ran into

  • Bridging biology and CS: Translating network propagation + variant annotation into outputs a non-expert can trust and understand—without overpromising clinical meaning.
  • Data plumbing in real time: Robustly mapping rsIDs to genes from mixed TXT/VCF formats and handling API rate limits.
  • Latency vs. depth: Balancing precomputation (ADAGIO) with just-in-time annotation; we added caching and parallelism to keep the UI snappy.
  • First hackathon for half of us: New tools, new workflows, shipping under a tight clock while aligning design, backend, and demo.

Accomplishments that we’re proud of

  • Shipped end-to-end: polished React UI + robust FastAPI backend, fully deployed and demo-ready.
  • Under-the-hood quality: clean API boundaries, streaming VCF parsing, cached annotations, and parallel top-k scoring for responsiveness.
  • Great UX polish: intuitive upload flow, clear risk levels, expandable tips, and easy CSV export.
  • Scientific framing: we made network biology approachable—precomputing ADAGIO scores so families get insights in seconds.

What we learned

  • How to operationalize a research method (ADAGIO) into an API with real latency constraints.
  • Practical genomics I/O: handling different file formats, edge cases, and mapping pipelines.
  • The impact of precomputation + caching and where parallelism pays off most for perceived speed.
  • Clear disclaimers and tone matter: users want actionable guidance, not anxiety or false certainty.
  • Team skills: faster scoping, defining interfaces early, and iterating frontend ↔ backend in lockstep.

What’s next for GeneGuard

  • More diseases and phenotype panels; expand and version ADAGIO tables.
  • Scalability: move from in-memory store to a lightweight DB, background jobs for heavy annotations, smarter caching.
  • Richer insights: pathway-level explanations, polygenic-style aggregation, cohort benchmarks (e.g., “siblings share X high-priority genes”).
  • Clinician-friendly export: one-page summary PDF with clear caveats and references.
  • Privacy & sharing: invite links with granular scopes; optional encryption at rest.
  • Accessibility: broader file support, glossary of terms, and more educational “What this means” content.
