Echoes - every place has a story

Inspiration

Brooklyn College sits on a former golf course. President Roosevelt laid the cornerstone in 1936. Eighteen hundred families were displaced from Sands Street when Robert Moses came through. The Navy Yard employed 70,000 people during World War II, including the first women ever hired as mechanics.

None of this is hidden. It's all in the archives, the newspapers, the public records. It's just never been made audible.

Walking around New York, you're constantly standing on top of history that nobody talks about anymore. Echoes started as a simple question: what if the city could tell you its own story?

What it does

Echoes is a location-based audio storytelling platform for New York City. Tap any spot on the map, search any address or landmark, or speak a location out loud, and within seconds you hear a first-person narrative told in the fictional voice of someone who lived there.

Each story is grounded in real historical sources pulled live from the web. The narrator's voice is automatically matched to the era and character of the story. Every session leaves a trail across the map of everywhere you've been, and when a story ends, the city suggests where to go next.

How we built it

The core pipeline runs like this:

  1. A location reaches the story pipeline from a map tap, a text search, or voice input transcribed by ElevenLabs Scribe.
  2. Nominatim translates map taps into real street addresses, and search queries into map coordinates.
  3. Tavily searches public archives, newspaper databases, and historical records for that location in real time.
  4. Claude reads those sources and generates a first-person narrative, picking an era, a narrator persona, a title, an intro, and a factual context summary, under explicit instructions to never invent facts and to cite what it found.
  5. ElevenLabs synthesizes two audio clips in parallel: a documentary-style intro and the personal story, each voiced by a character matched to the narrator's age, background, and era.
  6. The frontend slides up the story card with the text, an animated waveform, and the real source titles Tavily found.
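To make the flow concrete, here is a rough sketch of how a story API route could chain those steps. The helpers searchArchives, writeNarrative, and synthesizeVoice are hypothetical stand-ins for the Tavily, Claude, and ElevenLabs calls, not our actual code:

```typescript
// app/api/story/route.ts (rough sketch only; helper names are hypothetical)
import { NextResponse } from "next/server";
import { searchArchives, writeNarrative, synthesizeVoice } from "@/lib/story"; // hypothetical module

export async function POST(req: Request) {
  const { lat, lon, query } = await req.json();

  // Steps 1-2: reverse-geocode the tapped point into a street address (Nominatim).
  const place = await fetch(
    `https://nominatim.openstreetmap.org/reverse?lat=${lat}&lon=${lon}&format=json`
  ).then((r) => r.json());

  // Step 3: pull real sources for the location in real time (Tavily).
  const sources = await searchArchives(query ?? place.display_name);

  // Step 4: Claude writes the narrative, era, persona, title, intro, and context summary.
  const story = await writeNarrative(place.display_name, sources);

  // Step 5: synthesize the intro and the story in parallel (ElevenLabs).
  const [introAudio, storyAudio] = await Promise.all([
    synthesizeVoice(story.intro, story.voiceId),
    synthesizeVoice(story.body, story.voiceId),
  ]);

  // Step 6: hand everything to the story card on the frontend.
  return NextResponse.json({ ...story, introAudio, storyAudio, sources });
}
```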

The whole stack is Next.js App Router on the frontend, Leaflet.js with CartoDB dark tiles for the map, and three API routes handling story generation, voice transcription, and geocoding. Everything deploys on Vercel, with no separate backend infrastructure to run.
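For reference, a minimal Leaflet setup with the CartoDB dark basemap looks roughly like this; the container id, initial view, and the requestStory handler are illustrative placeholders:

```typescript
import L from "leaflet";
import "leaflet/dist/leaflet.css";

// Center on Brooklyn; container id and zoom are placeholders.
const map = L.map("map").setView([40.6501, -73.9496], 13);

// CartoDB "dark_all" basemap tiles.
L.tileLayer("https://{s}.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}{r}.png", {
  attribution: "&copy; OpenStreetMap contributors &copy; CARTO",
  maxZoom: 20,
}).addTo(map);

// Forward taps to the story pipeline (handler name is hypothetical).
map.on("click", (e) => requestStory(e.latlng.lat, e.latlng.lng));
```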

Challenges

Getting the voice to sound human. Early versions had ElevenLabs reading in a flat, textbook cadence. The fix was two-fold: rewriting the Claude prompt to generate fragmented, conversational speech patterns rather than grammatically complete sentences, and using SSML break tags to insert natural pauses after every sentence. The difference between "She worked at the Navy Yard during the war." and "She worked the Yard. During the war. Can you believe that." is everything.
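A minimal sketch of the pause insertion, assuming the standard ElevenLabs text-to-speech endpoint and its support for break tags; the break duration, model id, and voice settings here are illustrative, not our tuned values:

```typescript
// Insert a short pause after every sentence before sending text to ElevenLabs.
function addPauses(story: string): string {
  return story.replace(/([.!?])\s+/g, '$1 <break time="0.4s" /> ');
}

async function synthesize(text: string, voiceId: string): Promise<ArrayBuffer> {
  const res = await fetch(`https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`, {
    method: "POST",
    headers: {
      "xi-api-key": process.env.ELEVENLABS_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text: addPauses(text),
      model_id: "eleven_multilingual_v2",
      voice_settings: { stability: 0.4, similarity_boost: 0.8 }, // illustrative values
    }),
  });
  return res.arrayBuffer();
}
```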

Making history feel personal, not encyclopedic. The first few stories Claude generated were accurate but emotionally flat; they read like Wikipedia. The breakthrough was adding a concrete example of bad vs good tone directly in the prompt, and requiring the narrator to mention the specific place by name naturally within the story rather than in a preamble.
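The tone guidance in the prompt looks something like the excerpt below. This is a paraphrase for illustration, not the exact wording we ship:

```typescript
// Paraphrased excerpt of the tone guidance in the Claude prompt (illustrative only).
const toneGuidance = `
Write as the narrator speaking out loud, not as an encyclopedia.
Bad:  "The Brooklyn Navy Yard employed roughly 70,000 workers during World War II."
Good: "Seventy thousand of us. Can you picture that? The whole Yard, humming all night."
Mention the place by name naturally inside the story, never in a preamble.
Never invent facts. Only use details found in the provided sources.
`;
```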

Geocoding specificity. Nominatim fails silently on informal or landmark-style queries: searching "the old Navy Yard gate" returns nothing, and the map just doesn't move. The fix was twofold: falling back to coordinates returned by the story API itself when Nominatim comes up empty, and passing natural language queries directly to Tavily rather than waiting for a clean geocoded address. The map now navigates correctly regardless of how specific or vague the input is.
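A sketch of that fallback, assuming the public Nominatim search endpoint; the function and field names are illustrative:

```typescript
// Resolve a free-text query to coordinates, falling back to whatever the
// story API returned when Nominatim finds nothing (names are hypothetical).
async function resolveLocation(
  query: string,
  storyCoords?: { lat: number; lon: number }
) {
  const res = await fetch(
    `https://nominatim.openstreetmap.org/search?q=${encodeURIComponent(query)}&format=json&limit=1`
  );
  const results = await res.json();

  // Normal case: Nominatim found the place.
  if (results.length > 0) {
    return { lat: parseFloat(results[0].lat), lon: parseFloat(results[0].lon) };
  }

  // Fallback: coordinates from the story API, which already handled the
  // natural-language query via Tavily and can still place the story.
  return storyCoords ?? null;
}
```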

The audio timing. Browser autoplay restrictions block audio that fires immediately after data arrives. The DOM needs the audio element mounted before .play() is called, which doesn't happen synchronously after a state update. A short setTimeout after the story response gives React time to commit the render before triggering playback. Simple fix, took an embarrassing amount of time to find :)
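A minimal sketch of that timing fix in a client component; the component, state, and endpoint shapes are illustrative:

```typescript
"use client";
import { useRef, useState } from "react";

// Sketch of the playback timing fix (names and response shape are hypothetical).
export function StoryPlayer() {
  const [story, setStory] = useState<{ audioUrl: string } | null>(null);
  const audioRef = useRef<HTMLAudioElement>(null);

  async function loadStory(lat: number, lon: number) {
    const data = await fetch("/api/story", {
      method: "POST",
      body: JSON.stringify({ lat, lon }),
    }).then((r) => r.json());

    setStory(data);

    // Give React time to commit the render so the <audio> element is mounted
    // before .play() is called; calling it synchronously here fails.
    setTimeout(() => audioRef.current?.play(), 100);
  }

  return (
    <>
      <button onClick={() => loadStory(40.7, -73.97)}>Tell me a story</button>
      {story && <audio ref={audioRef} src={story.audioUrl} />}
    </>
  );
}
```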

What we learned

Real-time AI pipelines feel magical when they work and infuriating when they don't. The margin between "this feels alive" and "this feels like a chatbot" is almost entirely in the prompt and the voice settings, not in the infrastructure. Spending the last 12 hours of the hackathon on prompt quality instead of new features was the right call.

History is everywhere. We tested Echoes on dozens of NYC locations while building it and kept getting surprised. The stories Tavily found (the 1920s speakeasies, the displaced communities, the wartime factories, the civil rights moments) were more interesting than anything we could have invented.

Built With

  • cartodb
  • claude (anthropic)
  • elevenlabs (tts + scribe stt)
  • leaflet.js
  • mediarecorder api
  • next.js
  • nominatim (openstreetmap)
  • react
  • tavily
  • vercel
  • web audio api