Inspiration

Every team I've worked with has the same pile: four thousand survey answers, app reviews, support tickets — and somebody read forty of them before averaging the rest into a score. The averages kill the outliers, and the outliers are where the truth lives. One evening I watched a starling murmuration video and realized that's what feedback should look like: thousands of individual voices, visibly moving as one shape, where a single bird flying alone is the most interesting thing in the sky.

What it does

Murmur turns any pile of qualitative text into a living murmuration. Paste raw voices (one per line) or drop a CSV; no account, no setup. A Gemini agent then does the analyst's work in the open, through the MongoDB MCP server, and every real tool call ticks across the screen with its measured latency:

  • Every voice becomes a bird (one document = one bird, born on screen by a MongoDB change stream event as its batch lands).
  • The agent embeds everything with gemini-embedding-001, creates an Atlas Vector Search index through MCP, clusters the embeddings, and names each flock — your data has a shape before you've read a word.
  • Type a question — "what do people actually hate?" — and the whole sky reorganizes: a real $vectorSearch runs through the MCP aggregate tool, and the birds sweep into bands that are literally that query's score distribution, with a one-line grounded answer (with citations) settling under the sky.
  • The lone bird: voices far from every flock fly alone, glowing amber. The most isolated voice is the most visible object on screen — the exact inversion of what averages do.
  • The "+ add a voice" button inserts a genuinely new document; the change stream pushes it to the browser as a newborn bird, with the measured insert-ack → pixel latency on screen. Judge-verifiable, unfakeable.
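
As a rough illustration of the question-driven re-flock described above, the sketch below builds a `$vectorSearch` pipeline scoped to one sky and then cuts the returned scores into bands. The index name, the field names (`embedding`, `skyId`, `text`), the result limit, and the choice of equal-count quantile bands are all assumptions made for the example; the writeup only says that band thresholds come from the real score distribution.

```javascript
// Build an aggregation pipeline for one semantic query against one sky.
// "voices_vector_index", "embedding", "skyId" and "text" are hypothetical names.
function buildSearchPipeline(queryVector, skyId, limit = 500) {
  return [
    {
      $vectorSearch: {
        index: "voices_vector_index",
        path: "embedding",
        queryVector,
        numCandidates: limit * 4, // must be >= limit
        limit,
        filter: { skyId }, // skyId must be declared as a filter field in the index
      },
    },
    { $project: { text: 1, score: { $meta: "vectorSearchScore" } } },
  ];
}

// Derive descending cut points so each band holds roughly the same
// number of birds (an assumed banding rule, not necessarily Murmur's).
function bandThresholds(scores, bandCount = 4) {
  if (scores.length === 0) return [];
  const sorted = [...scores].sort((a, b) => b - a);
  const thresholds = [];
  for (let i = 1; i < bandCount; i++) {
    const idx = Math.min(
      sorted.length - 1,
      Math.floor((i * sorted.length) / bandCount)
    );
    thresholds.push(sorted[idx]);
  }
  return thresholds;
}

// Band 0 is the closest match; higher band numbers sit further from the query.
function assignBand(score, thresholds) {
  let band = 0;
  while (band < thresholds.length && score <= thresholds[band]) band++;
  return band;
}
```

Because the cut points are read off the scores themselves, a query with a tight score distribution produces narrow bands and a diffuse one spreads the sky out, which is the sense in which the motion reflects the query result.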

Try the demo sky: 4,182 real NYC 311 service requests, pre-seeded and one click from the landing page.

How I built it

  • The sky: three.js InstancedMesh boids — up to 10,000 birds in one draw call at 60 FPS, classic Reynolds steering plus per-bird attraction to its flock anchor. No dashboard exists; the murmuration is the entire interface.
  • The analyst: a Node agent driving Gemini on Vertex AI (Gemini 3 Flash, with gemini-embedding-001 for 768-d embeddings, batched 250 texts per call). Database actions (schema inspection, insert-many, create-index with type vectorSearch, $vectorSearch aggregations, flock/outlier writes, deletion) go through the official mongodb-mcp-server over stdio, apart from the index-creation fallback described under Challenges, and the on-screen ticker prints those calls verbatim with real measured milliseconds.
  • The data: MongoDB Atlas (free M0) — voices, flocks, outliers, skies, geometry collections; Atlas Vector Search for all semantic geometry; aggregation pipelines for flock stats; change streams for liveness; TTL indexes so skies politely forget themselves.
  • Clustering: k-means with silhouette-score auto-k plus PCA projection, computed server-side in plain JavaScript.
  • Hosting: one container on Cloud Run serving the app, the API, the WebSocket hub, and the agent.
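
The steering described for the sky (classic Reynolds rules plus a pull toward each bird's flock anchor) can be sketched in plain JavaScript as follows. This is a minimal illustration, not the project's code: the weights, neighbor radius, speed cap, and the `position`/`velocity`/`flock` bird shape are hypothetical, and the neighbor scan is a simple O(n²) loop that a 10,000-bird sky would likely replace with a spatial grid. Rendering through `InstancedMesh` is left out.

```javascript
// Hypothetical tuning constants for the sketch.
const WEIGHTS = { separation: 1.5, alignment: 1.0, cohesion: 1.0, anchor: 0.5 };
const NEIGHBOR_RADIUS = 5;
const MAX_SPEED = 2;

const add = (a, b) => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const scale = (a, s) => [a[0] * s, a[1] * s, a[2] * s];
const length = (a) => Math.hypot(a[0], a[1], a[2]);

// Acceleration for one bird: separation, alignment and cohesion over
// nearby birds, plus attraction to the anchor of the bird's flock.
function steer(bird, birds, anchor) {
  let separation = [0, 0, 0];
  let velocitySum = [0, 0, 0];
  let positionSum = [0, 0, 0];
  let count = 0;
  for (const other of birds) {
    if (other === bird) continue;
    const offset = sub(bird.position, other.position);
    const d = length(offset);
    if (d === 0 || d > NEIGHBOR_RADIUS) continue;
    separation = add(separation, scale(offset, 1 / (d * d))); // stronger when closer
    velocitySum = add(velocitySum, other.velocity);
    positionSum = add(positionSum, other.position);
    count++;
  }
  let accel = scale(sub(anchor, bird.position), WEIGHTS.anchor);
  if (count > 0) {
    const avgVelocity = scale(velocitySum, 1 / count);
    const center = scale(positionSum, 1 / count);
    accel = add(accel, scale(separation, WEIGHTS.separation));
    accel = add(accel, scale(sub(avgVelocity, bird.velocity), WEIGHTS.alignment));
    accel = add(accel, scale(sub(center, bird.position), WEIGHTS.cohesion));
  }
  return accel;
}

// Advance every bird by dt seconds, clamping speed to MAX_SPEED.
// Accelerations are computed first so all birds see the same snapshot.
function step(birds, anchors, dt) {
  const accels = birds.map((b) => steer(b, birds, anchors[b.flock]));
  birds.forEach((b, i) => {
    let v = add(b.velocity, scale(accels[i], dt));
    const speed = length(v);
    if (speed > MAX_SPEED) v = scale(v, MAX_SPEED / speed);
    b.velocity = v;
    b.position = add(b.position, scale(v, dt));
  });
}
```

Re-flocking on a query then amounts to moving the anchors: the steering rules stay the same and the birds sweep toward their new positions on their own.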

Challenges I ran into

  • Pure-JS clustering at 768 dimensions was too slow. Fix: a fixed 32-d Johnson–Lindenstrauss projection for k-means/silhouette/PCA — the geometry is still computed from the real embeddings, never canned.
  • MCP server quirks in production. The deployed mongodb-mcp-server build auto-connects from its connection string and disables its connect tool, so the ticker cannot honestly open with a connect call; the mission's first line is the first real tool call instead. And where an MCP build can't create search indexes, the driver creates the vector index as a documented fallback, while the agent still drives everything else through MCP.
  • Free-tier discipline. M0 allows 3 search indexes (Murmur uses one, filtered by sky), embedding calls cap at 250 texts, and a sky caps at 5,000 voices. The constraint became the show: a 4–5k pour forms in about a minute, and the wait — birds spawning batch by batch — is the best part.
  • Making motion mean something. The re-flock had to be the actual query result rendered as movement, not decoration: band thresholds come from the real $vectorSearch score distribution, and the ms figure on screen is the measured call time.
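
The fixed 32-d projection mentioned in the first challenge can be sketched like this. It assumes a dense Gaussian random matrix generated from a seeded PRNG, which is one standard Johnson–Lindenstrauss construction; the writeup does not say which construction Murmur uses, and the seed, function names, and PRNG are hypothetical. Fixing the seed means every sky is projected with the same matrix, so results are reproducible across runs.

```javascript
// Small seeded PRNG (mulberry32-style) so the projection matrix is fixed.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal sample via the Box-Muller transform.
function gaussian(rand) {
  const u1 = 1 - rand(); // in (0, 1], so Math.log stays finite
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Build an outputDim x inputDim matrix with N(0, 1/outputDim) entries,
// so squared distances are preserved in expectation.
function makeProjection(inputDim, outputDim, seed) {
  const rand = seededRandom(seed);
  const factor = 1 / Math.sqrt(outputDim);
  const rows = [];
  for (let r = 0; r < outputDim; r++) {
    const row = new Float64Array(inputDim);
    for (let c = 0; c < inputDim; c++) row[c] = gaussian(rand) * factor;
    rows.push(row);
  }
  return rows;
}

// Project one embedding into the low-dimensional space.
function project(vector, matrix) {
  const out = new Float64Array(matrix.length);
  for (let r = 0; r < matrix.length; r++) {
    const row = matrix[r];
    let sum = 0;
    for (let c = 0; c < row.length; c++) sum += row[c] * vector[c];
    out[r] = sum;
  }
  return out;
}
```

With 768-d inputs, each k-means or silhouette distance then costs 32 multiplications instead of 768, while the points being clustered are still derived from the real embeddings.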

Accomplishments that I'm proud of

  • The outlier inversion: analytics that makes the least average voice the most visible.
  • Total honesty of the spectacle — every ticker line is a verbatim MCP call with its true latency, every band is a real score, every birth is a measured change-stream roundtrip.
  • 10,000 birds, one draw call, 60 FPS on an ordinary laptop.
  • A stranger can verify liveness in ten seconds: add a voice, watch it get born, read the Δ (the measured insert-ack → pixel latency).

What I learned

  • Vector search can be choreography: a $vectorSearch score distribution is expressive enough to be the interface.
  • MCP is a genuinely good shape for "agent hands" — and because every call is observable, you can put the agent's work on stage instead of behind a spinner.
  • Free-tier limits are design material, not obstacles.

What's next

  • Bring-your-own Atlas (the Roost settings page already shows it; new pours would land on your cluster).
  • Sky-to-sky comparison — last quarter's murmuration beside this quarter's.
  • Watching specific flocks over time: subscribe to a theme, get told when a new bird joins it.
