Forging
Forging is built as a multi-agent system on top of Gemini 3's Interactions API — not a prompt wrapper.
- 2-Agent Pipeline with Interaction Chaining: An Observer agent analyzes gameplay video with thinking_level="high", generating 10-20 timestamped tips. Its interaction_id chains to a Validator agent that cross-checks each tip against the video, assigning confidence scores 1-10. Only tips scoring 8+ survive. The video is uploaded once via the File API and persists across the entire chain — no re-upload needed.
- Multimodal Video Understanding: Gemini 3 Pro watches gameplay frame-by-frame alongside parsed replay data. For CS2, it reads HUD elements, crosshair placement, and positioning. For AoE2, it tracks resources, unit compositions, and build orders. No game API required.
- Structured Output with response_schema: Native JSON schema enforcement ensures deterministic output format at every pipeline step — timestamps, categories, severity, reasoning — directly renderable in the UI.
- Extended Thinking: Both agents use high thinking levels for deep reasoning. The Observer reasons about gameplay patterns; the Validator reasons about whether each observation is actually visible in the video or a hallucination.
- Follow-up Chat: Chat chains from the Validator's interaction_id, inheriting full pipeline context (video + analysis) without re-sending anything.
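To make the chaining and the 8+ cutoff concrete, here is a minimal sketch of the Observer → Validator flow. It does not call the Gemini API: the agents are stub functions, and the names `Tip`, `run_pipeline`, `fake_observe`, and `fake_validate`, the "MM:SS" timestamp format, and the sample tips are all assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

MIN_CONFIDENCE = 8  # from the write-up: only tips scoring 8+ survive


@dataclass(frozen=True)
class Tip:
    timestamp: str       # assumed "MM:SS" format
    text: str
    confidence: int = 0  # set by the Validator, 1-10


# Hypothetical agent signatures. The real agents would call the model;
# here each one returns its tips plus an interaction id to chain from.
Observer = Callable[[], Tuple[List[Tip], str]]
Validator = Callable[[List[Tip], str], Tuple[List[Tip], str]]


def run_pipeline(observe: Observer, validate: Validator) -> Tuple[List[Tip], str]:
    """Run the Observer, chain its id into the Validator, keep confident tips."""
    tips, observer_id = observe()
    scored, validator_id = validate(tips, observer_id)
    kept = [t for t in scored if t.confidence >= MIN_CONFIDENCE]
    # The Validator's id is returned so a follow-up chat can chain from it.
    return kept, validator_id


# Stub agents with made-up data, for illustration only.
def fake_observe() -> Tuple[List[Tip], str]:
    tips = [
        Tip("01:10", "Buy armor before utility"),
        Tip("05:42", "Peeked without cover"),
    ]
    return tips, "obs-1"


def fake_validate(tips: List[Tip], previous_id: str) -> Tuple[List[Tip], str]:
    scores = {"01:10": 9, "05:42": 4}  # pretend the second tip was not visible
    return [replace(t, confidence=scores[t.timestamp]) for t in tips], "val-1"


kept, chat_id = run_pipeline(fake_observe, fake_validate)
print([t.timestamp for t in kept], chat_id)
```

Returning the Validator's id alongside the surviving tips mirrors the idea above that chat picks up from the end of the chain rather than starting a fresh context.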
Inspiration
I'm an active competitive player in both Age of Empires II and Counter-Strike 2. Like most players trying to climb ranks, I've spent countless hours watching my replays trying to figure out what I did wrong. The problem? I'm not good enough to spot my own mistakes. And hiring a coach at $20-50/hour isn't realistic for regular sessions.
When I saw what Gemini could do with long-form video understanding, the idea clicked: what if AI could watch my gameplay like a human coach would - understanding visual context, identifying patterns, and giving me actionable advice with exact timestamps?
What it does
Forging lets players upload their match replays or gameplay videos and receive AI-powered coaching. The system:
- Analyzes full matches (up to 30 minutes, 700MB videos) without chunking
- Generates timestamped coaching tips - click any tip to jump to that exact moment
- Enables contextual chat - ask follow-up questions with full match context ("Why did I lose that fight?")
- Works across game genres - currently supports CS2 (FPS) and Age of Empires II (RTS)
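Click-to-jump needs each tip's timestamp as a seek position in seconds. The sketch below shows one way that conversion could work. The "MM:SS" / "H:MM:SS" timestamp format and the helper names `timestamp_to_seconds` and `seek_target` are assumptions; the write-up does not specify either.

```python
MAX_MATCH_SECONDS = 30 * 60  # from the write-up: matches up to 30 minutes


def timestamp_to_seconds(ts: str) -> int:
    """Convert "SS", "MM:SS" or "H:MM:SS" into a number of seconds."""
    parts = ts.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported timestamp: {ts!r}")
    total = 0
    for part in parts:
        value = int(part)  # raises ValueError on non-numeric input
        if value < 0:
            raise ValueError(f"negative component in timestamp: {ts!r}")
        total = total * 60 + value
    return total


def seek_target(ts: str, video_duration_s: int = MAX_MATCH_SECONDS) -> int:
    """Return the seek position for a tip, rejecting out-of-range timestamps."""
    seconds = timestamp_to_seconds(ts)
    if seconds > video_duration_s:
        raise ValueError(f"{ts!r} is past the end of the video")
    return seconds


print(seek_target("12:34"))
```

Rejecting timestamps past the end of the video is a cheap sanity check that can run before a tip ever reaches the UI.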
How we built it
The Stack
- Frontend: Next.js, React, TypeScript, Tailwind CSS.
- Backend: Python FastAPI.
- AI: Gemini 3 Pro via the Gemini API.
- Infrastructure: Google Cloud Run, Cloud Storage, Google Firestore
Gemini Features Used
- File API: Upload 700MB, 30-minute match videos.
- Multimodal: Analyze video + replay data + chat together.
- Thinking Mode: Deep reasoning for both agents
- Interactions API: Chain Observer → Validator with shared context
- TTS: Coaching tips with voice-over.
- Structured Output: Reliable JSON for UI rendering
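As an illustration of what structured output buys the UI, here is a sketch of a tip schema and a small parser that checks a response against it. The field names follow the ones mentioned earlier (timestamp, category, severity, reasoning), but the exact schema, the severity values, and the `parse_tips` helper are assumptions rather than the project's real definitions.

```python
import json

# Hypothetical schema, written in the JSON-schema style that
# schema-constrained model output typically uses.
TIP_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "timestamp": {"type": "string"},
            "category": {"type": "string"},
            "severity": {"type": "string", "enum": ["low", "medium", "high"]},
            "reasoning": {"type": "string"},
        },
        "required": ["timestamp", "category", "severity", "reasoning"],
    },
}


def parse_tips(raw: str) -> list:
    """Parse a JSON response and check it against TIP_SCHEMA's basics."""
    tips = json.loads(raw)
    if not isinstance(tips, list):
        raise ValueError("expected a JSON array of tips")
    item = TIP_SCHEMA["items"]
    allowed = item["properties"]["severity"]["enum"]
    for i, tip in enumerate(tips):
        if not isinstance(tip, dict):
            raise ValueError(f"tip {i} is not an object")
        missing = [key for key in item["required"] if key not in tip]
        if missing:
            raise ValueError(f"tip {i} is missing fields: {missing}")
        if tip["severity"] not in allowed:
            raise ValueError(f"tip {i} has unknown severity: {tip['severity']!r}")
    return tips


sample = json.dumps([{
    "timestamp": "03:15",
    "category": "economy",
    "severity": "high",
    "reasoning": "Forced a buy that left the team broke next round.",
}])
tips = parse_tips(sample)
print(tips[0]["category"])
```

Even with schema enforcement on the model side, a check like this on the backend keeps a malformed response from reaching the frontend.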
Challenges we ran into
Hallucinations in Timestamps
Early versions would generate tips with timestamps where nothing relevant happened. The 2-agent architecture with explicit verification largely solved this - the Validator cross-checks every timestamp against the actual video. That said, hallucinations still slip through once in a while and need to be fixed. For example, when a nearby teammate throws a grenade, the system sometimes thinks you threw it.
Prompt engineering for game-specific analysis
It was hard to get the AI to understand game-specific concepts (CS2 economy, AoE2 build orders) without its output being too verbose or missing key moments.
Game-Specific vs Generic
Balancing game-specific knowledge (CS2 economy, AoE2 build orders) with a generic architecture was tricky. Solved with modular parsers and knowledge bases that plug into the same pipeline.
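A rough sketch of what "modular parsers and knowledge bases that plug into the same pipeline" could look like is below. The `GameModule` type, the registry, the placeholder parsers, and the knowledge strings are all hypothetical; the real parsers extract far more than a file size.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class GameModule:
    name: str
    parse_replay: Callable[[bytes], dict]  # game-specific replay parser
    knowledge: str                         # game-specific coaching notes


REGISTRY: Dict[str, GameModule] = {}


def register(module: GameModule) -> None:
    REGISTRY[module.name] = module


def build_observer_prompt(game: str, replay: bytes) -> str:
    """Generic pipeline step: the same code path serves every registered game."""
    if game not in REGISTRY:
        raise ValueError(f"unsupported game: {game!r}")
    module = REGISTRY[game]
    facts = module.parse_replay(replay)
    return "\n".join([
        f"Game: {module.name}",
        f"Knowledge: {module.knowledge}",
        f"Replay facts: {json.dumps(facts, sort_keys=True)}",
    ])


# Placeholder parsers: real ones would extract rounds, economy, build orders.
def parse_cs2_demo(replay: bytes) -> dict:
    return {"format": "cs2-demo", "size_bytes": len(replay)}


def parse_aoe2_recording(replay: bytes) -> dict:
    return {"format": "aoe2-recording", "size_bytes": len(replay)}


register(GameModule("cs2", parse_cs2_demo, "CS2 economy and utility usage"))
register(GameModule("aoe2", parse_aoe2_recording, "AoE2 build orders and resources"))

prompt = build_observer_prompt("cs2", b"abc")
print(prompt)
```

With this shape, supporting another game means registering one more `GameModule`; `build_observer_prompt` itself stays unchanged.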
Rate Limits During Development
Heavy video analysis + thinking mode burns through quotas fast. Implemented API key rotation and caching of Gemini file uploads for iterative testing.
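The two workarounds can be sketched roughly as follows. This is an assumption-laden illustration, not the project's code: `KeyRotator`, `UploadCache`, and the `fake_upload` stand-in are hypothetical names, and the real upload step would go through the File API instead of a local function.

```python
import hashlib
from itertools import cycle
from typing import Callable, Dict, List


class KeyRotator:
    """Hand out API keys round-robin so no single key absorbs every request."""

    def __init__(self, keys: List[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(keys)

    def next_key(self) -> str:
        return next(self._keys)


class UploadCache:
    """Remember upload handles by content hash to skip repeat uploads."""

    def __init__(self, upload_fn: Callable[[bytes], str]) -> None:
        self._upload = upload_fn
        self._by_hash: Dict[str, str] = {}

    def get_or_upload(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._by_hash:
            self._by_hash[digest] = self._upload(data)
        return self._by_hash[digest]


# Stand-in for the real upload call; counts how often it is invoked.
upload_calls = []


def fake_upload(data: bytes) -> str:
    upload_calls.append(len(data))
    return f"files/demo-{len(upload_calls)}"


rotator = KeyRotator(["key-a", "key-b"])
cache = UploadCache(fake_upload)

first = cache.get_or_upload(b"match video bytes")
second = cache.get_or_upload(b"match video bytes")  # served from the cache
print(first, second, len(upload_calls))
```

One caveat with combining the two: an uploaded file may only be visible to the key or project that uploaded it, so a real cache would likely need to record which key owns each handle.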
Accomplishments that we're proud of
Multi-Agent Verification Pipeline
The Observer → Validator architecture with confidence scoring dramatically reduced hallucinations. Getting there took a long time: I started from a single prompt and moved through Gemini 2.0 Flash, Gemini 2.5 Flash, Gemini 2.5 Pro, Gemini 3 Flash, and Gemini 3 Pro, then tried a 4-agent pipeline with an Orchestrator and a few more variants. I'm proud to have learned everything by doing. A key lesson was learning when to verify and against what: the replay parsers for Age of Empires II: Definitive Edition and Counter-Strike 2 are essential here. Tips that don't match video evidence get filtered out before reaching the user. This went from "AI sometimes makes things up" to "every tip has been cross-checked".
Game-Agnostic Architecture
The same pipeline analyzes both Counter-Strike 2 (FPS) and Age of Empires II (RTS) - two completely different game genres with different visual languages, strategies, and skill sets. Adding a new game requires only a parser and prompts, not new infrastructure.
Voice Coaching
Tips are read aloud using natural TTS, turning the analysis into a spoken coaching session you can listen to while rewatching your gameplay.
End-to-End Deployment
Fully deployed on Google Cloud (Cloud Run, Cloud Storage, Firestore) with a live demo anyone can use. Not just a prototype - a working product.
What we learned
- Multi-agent systems need explicit verification - A second agent checking the first agent's work dramatically reduces hallucinations
- Long context changes everything - Not having to chunk a 30-minute match preserves crucial temporal relationships
- Structured output is underrated - Guaranteeing valid JSON from every response simplified the entire frontend integration
- Game-agnostic is possible - The same architecture works for an FPS and an RTS with only a game-specific parser and prompts
What's next for Forging
- Put it in the hands of more users ASAP.
- Skill progression tracking - Compare your metrics across multiple games.
- Team communication analysis - Analyze voice comms for team coordination.
- Input analysis - Keyboard/mouse patterns and shortcut optimization.
- More games: Valorant, League of Legends, Dota 2, Rocket League.
Built With
- fastapi
- gemini
- google-cloud
- google-cloud-run
- google-firestore
- nextjs
- python
- react
- typescript
