Most travel guides are either too factual (cold and list-like) or too fictional (fun, but unreliable). I wanted something in between: a way to feel the human texture of a place while still staying anchored to what’s checkable.
That idea became ChronoGuide AI: an AI-curated audio guide that lets you explore the same place through multiple perspectives—so history feels present, not distant.
About the Project
ChronoGuide AI transforms standard place information into an immersive, character-driven listening experience.
What it does
Search & Discover: Search a landmark or tap recommended locations.
Contextual Overviews: Get a clean overview including era, location context, and architectural style.
Perspective Switching: Choose from three AI-generated narration styles:
Curator-style guide: Neutral and structured.
Historical figure voice: First-person, informed by historical sources.
Era “witness” voice: Everyday perspective for atmosphere.
Transparency Built-in:
Creative Basis: Clarifies what’s dramatized to convey context.
Fact Anchors: Core checkable points.
Sources Panel: Shows the foundations of the narration.
Engaged Loading: During generation, short place-based Q&A appears so waiting time becomes learning time.
Pocket Mode: Supports audio-first listening for eyes-free exploration.
How I Built It (Gemini Integration)
Gemini is not an add-on—it’s the engine of the entire app:
Synthesizes narration with prebuilt voices (e.g., Charon, Fenrir, Aoede).
Voices are selected dynamically based on the character’s role, age, and tone.
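A minimal sketch of how this kind of casting could work, written in Python for brevity. The voice names come from the list above. Everything else (the `Persona` fields, the `VOICE_TRAITS` table, and `pick_voice`) is hypothetical, and the traits assigned to each voice are illustrative assumptions, not documented properties of the voices.

```python
from dataclasses import dataclass

# Hypothetical trait table: which persona traits each prebuilt voice suits.
# The trait assignments are illustrative assumptions, not documented facts.
VOICE_TRAITS = {
    "Charon": {"curator", "older", "calm"},
    "Fenrir": {"historical_figure", "adult", "energetic"},
    "Aoede": {"witness", "young", "warm"},
}


@dataclass(frozen=True)
class Persona:
    role: str  # e.g. "curator", "historical_figure", "witness"
    age: str   # e.g. "young", "adult", "older"
    tone: str  # e.g. "calm", "energetic", "warm"


def pick_voice(persona: Persona) -> str:
    """Return the voice whose trait set overlaps most with the persona."""
    wanted = {persona.role, persona.age, persona.tone}
    # max() keeps the first best match, so ties resolve in table order.
    return max(VOICE_TRAITS, key=lambda name: len(VOICE_TRAITS[name] & wanted))
```

Calling `pick_voice` once per character and reusing the result for every segment is one simple way to keep the voice stable across a narration.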
Performance & Latency Strategy
To reduce waiting time for long-form narration:
Split each script into segments.
Synthesize the segments in parallel with the low-latency Gemini 2.5 Flash TTS model.
Merge audio buffers client-side for seamless multi-minute playback.
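The three steps above can be sketched as follows. This is a rough Python illustration, not the app's client-side code: `split_script` and `synthesize_all` are hypothetical names, the `synthesize` callable stands in for a real TTS request, and the merge step assumes every segment comes back as headerless PCM in the same format, so the buffers can simply be concatenated.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_script(script: str, max_chars: int = 400) -> List[str]:
    """Split a script into segments, breaking only at sentence boundaries."""
    segments: List[str] = []
    current = ""
    for sentence in script.replace("\n", " ").split(". "):
        sentence = sentence.strip()
        if not sentence:
            continue
        # Restore the period that split() consumed.
        if not sentence.endswith((".", "!", "?")):
            sentence += "."
        if current and len(current) + len(sentence) + 1 > max_chars:
            segments.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        segments.append(current)
    return segments


def synthesize_all(
    segments: List[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 4,
) -> bytes:
    """Synthesize segments concurrently and merge them in script order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, even if calls finish out of order.
        chunks = list(pool.map(synthesize, segments))
    # Assumes same-format, headerless PCM, so concatenation is a valid merge.
    return b"".join(chunks)
```

Splitting at sentence boundaries keeps each segment self-contained, which helps avoid audible seams where the buffers are joined.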
Challenges I Faced
Creativity vs. Accuracy: Storytelling needs guardrails. The Creative Basis / Fact Anchors / Sources structure keeps narration engaging without pretending to be first-hand testimony.
Latency & Quotas: TTS can hit limits under heavy traffic. I designed the UX so the app remains meaningful (via text and structure) even if audio is temporarily unavailable.
Voice Consistency: Keeping the same “character feel” across segments required consistent voice selection and careful segmentation.
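For the Latency & Quotas point, one possible shape of the fallback is sketched below, assuming a TTS wrapper that raises a dedicated error when a limit is hit. `QuotaExceeded`, `SegmentResult`, and `narrate_segment` are hypothetical names for illustration, and the retry counts and delays are arbitrary.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class QuotaExceeded(Exception):
    """Hypothetical error raised by the TTS wrapper when a limit is hit."""


@dataclass
class SegmentResult:
    text: str
    audio: Optional[bytes]  # None means the UI should show text only


def narrate_segment(
    text: str,
    synthesize: Callable[[str], bytes],
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> SegmentResult:
    """Try to synthesize audio; fall back to text-only if the quota persists."""
    for attempt in range(retries + 1):
        try:
            return SegmentResult(text, synthesize(text))
        except QuotaExceeded:
            if attempt < retries:
                # Exponential backoff: 0.5s, 1.0s, ... before the next try.
                sleep(base_delay * 2 ** attempt)
    # Audio is unavailable, but the text is still returned for display.
    return SegmentResult(text, None)
```

Because the text always comes back, the overview, Fact Anchors, and Sources can render even when no audio is produced.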
What I Learned
UI-Safe Generation: How to make generative outputs reliable using structured JSON schemas.
Productized Storytelling: How to turn LLM outputs into a product experience centered on transparency and trust.
Perceived Speed: Improving user experience through parallel generation and interactive "loading" states like Q&A.
Gemini Integration in ChronoGuide AI
ChronoGuide AI uses Gemini 3 Flash and Gemini 2.5 Flash TTS to turn standard travel information into an immersive, character-driven audio experience. The integration is not just a feature; it is the engine driving the entire application.
Gemini 3 Flash serves as the narrative brain. Its reasoning and creative capabilities are used to generate distinct historical personas, ranging from "Curator Guides" to source-informed "Historical Figures" and "Witnesses." With JSON Mode (responseSchema), the model returns strictly structured data, so complex outputs like location summaries, trivia questions, and multi-segment narratives slot directly into the UI.
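Even with a response schema set, a cheap check before rendering can catch malformed payloads. The sketch below shows one way to do that in plain Python. The field names in `REQUIRED_FIELDS` are hypothetical, since the actual schema is not shown in this write-up, and `parse_narration` is an illustrative helper, not part of any Gemini SDK.

```python
import json
from typing import Any, Dict

# Hypothetical shape of one narration payload; the field names are assumptions.
REQUIRED_FIELDS = {
    "persona": str,
    "segments": list,
    "fact_anchors": list,
    "creative_basis": str,
}


def parse_narration(raw: str) -> Dict[str, Any]:
    """Parse model output and reject anything the UI cannot render."""
    data = json.loads(raw)  # raises a ValueError subclass on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"missing or invalid field: {field}")
    segments = data["segments"]
    if not segments or not all(isinstance(s, str) and s.strip() for s in segments):
        raise ValueError("segments must be a non-empty list of non-empty strings")
    return data
```

Failing fast here means a bad generation can be retried or replaced with a plain-text overview instead of breaking the screen.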
Gemini 2.5 Flash TTS brings these personas to life. The app dynamically casts specific prebuilt voices (such as 'Charon', 'Fenrir', and 'Aoede') based on the gender, age, and role of the generated character, creating a deeply atmospheric audio experience.
The app also builds on the Flash series' low latency with parallel processing: it splits narrative scripts into segments, synthesizes them concurrently, and merges the audio buffers on the client side. Users get rich, multi-minute audio tours with minimal buffering, so history feels instant and alive.