Inspiration

Every AI tool that promises to help your team ships your data to the cloud. Your code, your emails, your documents: all of it processed on someone else's servers. We wanted to build the opposite: an AI that knows everything about your team without knowing anything about you.

What it does

Swarm turns your team's devices into a private AI mesh. Ask a question — it searches across every connected laptop, your shared GitHub repo, and team documents simultaneously, synthesizes an answer using a local LLM, and streams it back. No data ever leaves your network.

How we built it

A FastAPI orchestrator runs on the AI PC, coordinating federated RAG across data nodes connected over Tailscale. Each node indexes its own local files with nomic-embed-text embeddings and responds to retrieval requests. The orchestrator fans out queries in parallel, re-ranks the returned chunks globally, and runs llama3.1:8b locally for synthesis.

The web UI is Next.js deployed on Vercel; it connects to the orchestrator via ngrok and streams tokens over WebSocket. The Rubik Pi acts as the shared team knowledge node running GitHub MCP, an Arduino UNO Q provides physical status feedback, and voice input works via WebRTC on mobile.
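To make the fan-out step concrete, here is a minimal sketch of how the orchestrator's parallel retrieval and global re-ranking could look. The node hostnames, the /retrieve endpoint, and the response shape are illustrative assumptions, not Swarm's actual API.

```python
# Minimal sketch of the orchestrator's fan-out and re-rank step (illustrative only).
# Node URLs, the /retrieve endpoint, and payload fields are assumptions,
# not the real Swarm interface.
import asyncio
import httpx

DATA_NODES = [
    "http://laptop-a.tailnet:8001",   # hypothetical Tailscale hostnames
    "http://laptop-b.tailnet:8001",
    "http://rubik-pi.tailnet:8001",
]

async def query_node(client: httpx.AsyncClient, node: str, question: str) -> list[dict]:
    """Ask one data node for its top-k locally embedded chunks."""
    try:
        resp = await client.post(f"{node}/retrieve", json={"query": question, "k": 5}, timeout=5.0)
        resp.raise_for_status()
        return resp.json()["chunks"]   # each chunk: {"text": ..., "score": ...}
    except httpx.HTTPError:
        return []                      # an unreachable node simply contributes nothing

async def federated_retrieve(question: str) -> list[dict]:
    """Fan out to every node in parallel, then re-rank all chunks globally."""
    async with httpx.AsyncClient() as client:
        results = await asyncio.gather(*(query_node(client, n, question) for n in DATA_NODES))
    all_chunks = [c for node_chunks in results for c in node_chunks]
    # Global re-rank: here simply by the nodes' own similarity scores;
    # a cross-encoder could slot in here instead.
    return sorted(all_chunks, key=lambda c: c["score"], reverse=True)[:10]
```

The key property the sketch preserves is that nodes only ever return scored chunks, never raw files; the local model on the orchestrator sees the merged top results and nothing else.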

Challenges we ran into

Getting true federated retrieval working across heterogeneous devices over Tailscale with consistent latency was the core engineering challenge. JWT-based node authentication with role-scoped access took careful design so that personal nodes are never queried by unauthorized users. Streaming tokens end-to-end, from Ollama through the orchestrator and ngrok to the browser, meant handling backpressure and connection drops at every hop.
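For the role-scoped piece, this is roughly the kind of check a personal data node runs before answering a retrieval request. It is a sketch assuming PyJWT and a shared mesh secret; the claim names and HS256 signing are assumptions, not Swarm's actual token layout.

```python
# Sketch of role-scoped node authorization. Claim names ("sub", "scopes")
# and the shared HS256 secret are illustrative assumptions.
import jwt  # PyJWT

MESH_SECRET = "replace-with-shared-mesh-secret"

def node_allows(token: str, node_owner: str) -> bool:
    """A personal node only answers retrieval requests from its owner,
    or from callers whose token carries a team-level scope."""
    try:
        claims = jwt.decode(token, MESH_SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        return False
    scopes = claims.get("scopes", [])
    return claims.get("sub") == node_owner or "team" in scopes
```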

Accomplishments that we're proud of

A fully working private AI mesh across real physical hardware — Rubik Pi, two laptops, a phone, and an Arduino — with zero cloud AI calls. The mesh canvas UI where you lasso-select nodes to query is genuinely novel. The security level system (A through ZZ) inherited from our teammate's Aligner RAG engine gives enterprise-grade access control on consumer hardware.
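As a rough illustration of how an A-through-ZZ clearance scheme can drive access checks, the sketch below maps levels to ordinals, spreadsheet-column style. The ordering and comparison rule are assumptions about the Aligner engine, not its actual implementation.

```python
# Illustrative mapping of A..ZZ security levels to ordinals (A lowest, ZZ highest,
# like spreadsheet columns). The real Aligner ordering may differ.
def level_rank(level: str) -> int:
    rank = 0
    for ch in level.upper():
        rank = rank * 26 + (ord(ch) - ord("A") + 1)
    return rank

def can_access(user_level: str, chunk_level: str) -> bool:
    """A user may read a chunk only if their clearance is at least the chunk's."""
    return level_rank(user_level) >= level_rank(chunk_level)

assert level_rank("A") < level_rank("Z") < level_rank("AA") < level_rank("ZZ")
```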

What we learned

Distributed inference across heterogeneous devices is a networking problem first and an AI problem second. Tailscale solved the networking. The hard part was making the system degrade gracefully when nodes go offline mid-query.
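The pattern that made degradation graceful is simple in outline: give every node query a deadline and synthesize from whichever results arrive in time. A minimal sketch, with illustrative names rather than Swarm's real interface:

```python
# Sketch of the degrade-gracefully pattern: keep whatever finished before the
# deadline, cancel the rest, and synthesize from the survivors.
import asyncio

async def gather_with_deadline(coros: list, deadline: float = 3.0) -> list:
    """Run node queries concurrently; keep results that finish before the
    deadline and silently drop nodes that time out or error mid-query."""
    if not coros:
        return []
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, timeout=deadline)
    for t in pending:
        t.cancel()                      # abandon slow or offline nodes
    results = []
    for t in done:
        if t.exception() is None:       # skip nodes that raised an error
            results.append(t.result())
    return results
```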

What's next for Swarm — Private AI mesh for your team

LoRA fine-tuning on the Snapdragon X Elite NPU using Windows AI Foundry to personalize the model on your team's data without any cloud training. A Tauri desktop app that installs everything automatically — pick Orchestrator or Data Node, select your folders, and you're in the mesh.
