Inspiration
I'm a second-year preparatory student in Tunisia — my curriculum is calculus, physics, and thermodynamics. But I kept thinking about one question: how far can you push a system where AI agents collaborate autonomously to produce something real — not a chatbot, not a summarizer, but a complete, working GitHub repository generated from nothing but a description?
That question became HackFarmer.
What It Does
HackFarmer takes a project description — plain text, PDF, or DOCX — and runs it through a pipeline of 8 specialized LangGraph agents. The output is a real GitHub repository, pushed live, containing a full-stack codebase with frontend, backend, and business documentation.
The pipeline:
- Analyst parses the input, identifies domain, user personas, and core features
- Architect designs the tech stack, folder structure, and API contracts
- Frontend + Backend + Business agents run in parallel — React components, FastAPI routes, and README/pitch docs generated simultaneously
- Integrator combines the parallel outputs into a coherent codebase
- Validator runs pure Python AST analysis (no LLM) and scores the output 0–100
- If score < 70, the pipeline routes back to the Integrator with structured feedback — automatically, up to 3 retries
- GitHub agent calls the Git Trees API directly to push the entire file tree in a single API call, with no git CLI and no subprocess
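For the GitHub step in the list above, the request body for GitHub's "create a tree" endpoint (`POST /repos/{owner}/{repo}/git/trees`) can be assembled from an in-memory file map. This is a minimal sketch rather than HackFarmer's actual code: the helper name `build_tree_payload` and the flat `path -> content` dictionary are assumptions. With the REST API, the resulting tree is then usually attached to a commit and a branch ref in follow-up requests, which are not shown here.

```python
from typing import Optional


def build_tree_payload(files: dict[str, str], base_tree: Optional[str] = None) -> dict:
    """Build the JSON body for GitHub's 'create a tree' endpoint.

    `files` maps repository-relative paths to text content (an assumed shape).
    Every entry is sent as a regular, non-executable file blob.
    """
    entries = [
        {
            "path": path,
            "mode": "100644",  # regular file
            "type": "blob",
            "content": content,
        }
        for path, content in sorted(files.items())
    ]
    payload: dict = {"tree": entries}
    if base_tree is not None:
        # Optional: SHA of an existing tree to layer these files on top of.
        payload["base_tree"] = base_tree
    return payload
```

Because the whole file map travels in one JSON body, no working directory or git binary is needed on the server.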
The user watches every step happen live via Appwrite Realtime WebSocket streaming.
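The Validator step in the pipeline above relies on static analysis instead of an LLM. The sketch below shows one way such a scorer could work using only Python's `ast` module. The function names, the penalty weights, and the feedback strings are assumptions for illustration; only the 0 to 100 range and the threshold of 70 come from the description, and this version looks at Python files only.

```python
import ast

PASS_THRESHOLD = 70  # from the write-up: scores below 70 trigger a retry


def _is_placeholder(stmt: ast.stmt) -> bool:
    """True for `pass` or a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is Ellipsis
    )


def score_python_files(files: dict[str, str]) -> tuple[int, list[str]]:
    """Score generated Python sources from 0 to 100 without calling an LLM.

    Assumed weights: a file that fails to parse costs 40 points, and each
    function whose body is only placeholders costs 10 points.
    """
    score = 100
    feedback: list[str] = []
    for path, source in files.items():
        try:
            tree = ast.parse(source, filename=path)
        except SyntaxError as exc:
            score -= 40
            feedback.append(f"{path}: syntax error on line {exc.lineno}")
            continue
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                if all(_is_placeholder(stmt) for stmt in node.body):
                    score -= 10
                    feedback.append(f"{path}: function '{node.name}' has an empty body")
    return max(score, 0), feedback
```

The feedback list is the kind of structured output that could be handed back to the Integrator when the score falls below `PASS_THRESHOLD`.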
How I Built It
Agent orchestration — LangGraph StateGraph with a shared ProjectState
TypedDict flowing through every node. Parallel fan-out uses LangGraph's Send
router. The retry loop is a conditional edge on the Validator node.
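The retry loop described above comes down to a small routing function. The sketch below uses only the standard library so it runs on its own; in LangGraph, a function like this would typically be registered on the Validator node with `add_conditional_edges`. The field names in `ProjectState`, the node names, and the choice to push the best attempt once retries run out are all assumptions, not details from the project.

```python
from typing import TypedDict

PASS_THRESHOLD = 70  # from the write-up
MAX_RETRIES = 3      # from the write-up


class ProjectState(TypedDict):
    """Shared pipeline state; these field names are hypothetical."""
    validation_score: int
    retry_count: int
    feedback: list[str]


def route_after_validation(state: ProjectState) -> str:
    """Return the name of the node that should run after the Validator."""
    if state["validation_score"] >= PASS_THRESHOLD:
        return "github"
    if state["retry_count"] < MAX_RETRIES:
        # Send the structured feedback back for another integration pass.
        return "integrator"
    # Assumption: once retries are exhausted, push the best attempt anyway.
    return "github"
```

Keeping the decision in a pure function makes the loop easy to unit test without building the whole graph.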
LLM resilience — A custom LLMRouter class manages a per-agent priority chain
across Gemini 2.0 Flash, Groq llama-3.3-70b, and OpenRouter. Different agents use
different primary providers based on empirical testing (e.g. the Business agent
uses llama-3.1-8b on Groq — smaller model, much faster, perfectly adequate for
docs). Each provider gets a 120-second timeout and 2 retries before the router cascades to the next.
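The cascade can be illustrated with plain `asyncio`. This is a sketch of the general fallback pattern, not the real `LLMRouter`: the function name `call_with_fallback`, the `(name, callable)` chain format, and the reading of "2 retries" as one initial attempt plus two more are assumptions.

```python
import asyncio
from typing import Awaitable, Callable

# A provider is any async callable that takes a prompt and returns text.
Provider = Callable[[str], Awaitable[str]]


async def call_with_fallback(
    prompt: str,
    chain: list[tuple[str, Provider]],
    timeout_s: float = 120.0,
    retries: int = 2,
) -> tuple[str, str]:
    """Try each provider in priority order; return (provider_name, completion)."""
    errors: list[str] = []
    for name, provider in chain:
        # One initial attempt plus `retries` further attempts per provider.
        for attempt in range(1 + retries):
            try:
                text = await asyncio.wait_for(provider(prompt), timeout=timeout_s)
                return name, text
            except Exception as exc:  # timeouts and provider errors alike
                errors.append(f"{name} attempt {attempt + 1}: {exc!r}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Giving each agent its own ordering of the same providers only requires passing a different `chain` list per agent.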
Concurrency — asyncio.Semaphore(3) caps concurrent pipelines globally, with
an asyncio.Queue for overflow. A startup crash guard resets any jobs stuck in
running state on dyno restart — critical for Heroku's ephemeral environment.
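Both ideas fit in a few lines of standard-library Python. In this sketch a plain dictionary stands in for the jobs table, tasks waiting on the semaphore stand in for the overflow queue, and resetting stuck jobs to `queued` (rather than, say, `failed`) is an assumption. The function names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

MAX_CONCURRENT_PIPELINES = 3  # from the write-up


def reset_stuck_jobs(jobs: dict[str, str]) -> list[str]:
    """Crash guard: on startup, requeue jobs that were left in 'running'.

    `jobs` maps job id to status. Returns the ids that were reset.
    """
    stuck = [job_id for job_id, status in jobs.items() if status == "running"]
    for job_id in stuck:
        jobs[job_id] = "queued"
    return stuck


async def run_limited(
    job_ids: list[str],
    pipeline: Callable[[str], Awaitable[None]],
    limit: int = MAX_CONCURRENT_PIPELINES,
) -> int:
    """Run `pipeline` for every job id, at most `limit` at a time.

    Returns the peak number of pipelines that ran concurrently.
    """
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def worker(job_id: str) -> None:
        nonlocal active, peak
        async with semaphore:  # excess jobs wait here until a slot frees up
            active += 1
            peak = max(peak, active)
            try:
                await pipeline(job_id)
            finally:
                active -= 1

    await asyncio.gather(*(worker(job_id) for job_id in job_ids))
    return peak
```

Running the crash guard before the first job is accepted means a restart never leaves a job that looks active but has no task behind it.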
Security — User API keys encrypted at rest with Fernet AES-128, decrypted only
at execution time. IDOR protection on every route (userId ownership check → 403).
Rate limiting via slowapi: 10 jobs/hr, 30/hr for key operations.
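The ownership check behind the IDOR protection can be sketched without any framework. The `Job` dataclass, the `Forbidden` exception, and the `require_owner` helper are hypothetical names; in a FastAPI route the same check would normally raise an `HTTPException` with status 403 instead.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""
    status_code = 403


@dataclass
class Job:
    job_id: str
    user_id: str  # owner of the job


def require_owner(job: Job, current_user_id: str) -> Job:
    """Return the job only if the caller owns it; otherwise raise Forbidden."""
    if job.user_id != current_user_id:
        raise Forbidden(f"user {current_user_id} does not own job {job.job_id}")
    return job
```

Calling a helper like this at the top of every route that takes a job id keeps the check from being forgotten on new endpoints.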
Real-time — Appwrite Realtime WebSocket subscription on the job-events
collection, with a 3-second polling fallback if the WebSocket drops.
Deployment — Single Heroku dyno. FastAPI serves the Vite-built React frontend
as static files. Two buildpacks (Node.js + Python) run in sequence via a release
phase build.sh. CI via GitHub Actions with branch protection on main.
Challenges
The silent queue bug — The queue manager updated job status to running but
never actually invoked the pipeline. Python didn't raise an error. Jobs sat frozen
forever. The fix required persisting rawText in the database on job creation and
having the poller read and execute it — a lesson about never separating a status
update from the action that should follow it.
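One way to keep the status update and the work together is to put both in a single coroutine, so a job cannot be marked as running without the pipeline being awaited right after. This is a sketch under assumptions: the job is a plain dictionary, `rawText` is the only field name taken from the write-up, and the status values and `claim_and_run` name are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def claim_and_run(
    job: dict,
    pipeline: Callable[[str], Awaitable[str]],
) -> None:
    """Mark a job as running and execute it in the same step."""
    job["status"] = "running"
    try:
        # The persisted input is read from the job record itself.
        job["result"] = await pipeline(job["rawText"])
        job["status"] = "done"
    except Exception as exc:
        # A failure is recorded instead of leaving the job frozen in 'running'.
        job["status"] = "failed"
        job["error"] = repr(exc)
```

With this shape, every exit path leaves the job in a terminal state.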
LangGraph TypedDict serialization — Setting state keys at runtime without declaring them in the TypedDict caused LangGraph's checkpointing to silently drop them. The GitHub agent crashed reading a key that had been serialized out. Lesson: treat TypedDict like a strict schema, declare everything upfront.
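A small guard can make undeclared keys fail loudly instead of disappearing. The sketch below compares a node's state update against the keys declared on the TypedDict. The `ProjectState` fields and the `undeclared_keys` helper are hypothetical, and this check is an illustration of the "strict schema" idea, not something LangGraph provides.

```python
from typing import TypedDict


class ProjectState(TypedDict, total=False):
    """Every key a node may write is declared here (field names are hypothetical)."""
    raw_text: str
    repo_url: str


def undeclared_keys(update: dict, schema: type = ProjectState) -> set[str]:
    """Return the keys in `update` that the TypedDict does not declare."""
    return set(update) - set(schema.__annotations__)
```

A node wrapper or a unit test could assert that `undeclared_keys(update)` is empty before the update is merged into the shared state.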
LLM provider overload — Using the same primary provider for all agents meant one rate limit event degraded the entire pipeline simultaneously. Distributing primary providers across agents means a Gemini rate limit only affects 2 of 8 agents, not all of them.
Single-dyno deployment — Getting Python and Node.js to coexist on one Heroku dyno with the right buildpack order, release phase build, and FastAPI SPA fallback routing took significant iteration.
What I Learned
Building a production multi-agent system taught me things no classroom covers: async concurrency edge cases, encrypted secrets management, LLM provider reliability engineering, real-time WebSocket architecture, and the gap between "it works locally" and "it works under load on a cold-start dyno."
I'm still in prépa. I built this before engineering school.