Inspiration

Remember when every stick was a sword? Every cardboard box, a fortress? That sense of wonder never dies. It just waits. Forge the World was born from one question: "What if?"

What if your pen could rewrite fate? What if that icicle was Frostmourne, waiting to be claimed? What if the gap between fantasy and reality was just one camera click away?

We built a reverse-isekai RPG where a blacksmith crashes into our world and sees the truth we forgot: everything has potential. Your desk isn't junk; it's an unforged arsenal. Your boring Tuesday? An adventure that hasn't started yet.

This is for everyone who's roleplayed with household items. Who's stayed up grinding in fantasy worlds. Who knows magic exists—if you know where to look.

Your world is your inventory. What will you forge?


What it does

Forge the World is a real-time survival defense RPG where reality becomes your inventory.

Gameplay:

  • Scan Objects: Point your camera at anything—pens, cups, scissors, snacks
  • Forge Equipment: AI transforms items into fantasy weapons/armor with unique stats
  • Cast Skills: Trigger healing, buffs, or attacks based on what you scan
  • Survive Waves: Battle monsters while managing limited "film charges"
  • Enhance Gear: Absorb materials to evolve your equipment

The Hook: The blacksmith misinterprets modern objects through a fantasy lens. Instant ramen? "This cylinder contains a sealed flame vortex! Remove the lid, and enemies taste hell!"


How we built it

Frontend (Mobile PWA)

  • Next.js 15 + TypeScript game engine
  • Zustand for state management
  • react-webcam for camera access
  • Custom game loop with bullet-time mechanics (0.1x slow motion)
  • PWA support for installable fullscreen gameplay
  • Animated WebP sprites with CSS blend modes for retro RPG effects
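
The bullet-time mechanic can be illustrated with a time-scaled clock. This is only a minimal sketch of the general technique, not our actual engine code: the real game loop runs in TypeScript, and the names GameClock, tick, and move_monster are hypothetical. The idea is that every frame's real elapsed time is multiplied by a scale factor (0.1 during bullet-time) before the game world is updated.

```python
from dataclasses import dataclass

BULLET_TIME_SCALE = 0.1  # the 0.1x slow-motion factor listed above


@dataclass
class GameClock:
    """Tracks scaled game time separately from real (wall-clock) time."""

    time_scale: float = 1.0
    game_time: float = 0.0

    def set_bullet_time(self, active: bool) -> None:
        # Switch between normal speed and slow motion.
        self.time_scale = BULLET_TIME_SCALE if active else 1.0

    def tick(self, real_dt: float) -> float:
        """Advance the clock by one frame and return the scaled delta."""
        scaled_dt = real_dt * self.time_scale
        self.game_time += scaled_dt
        return scaled_dt


def move_monster(x: float, speed: float, scaled_dt: float) -> float:
    """Hypothetical update step: movement uses the scaled delta only."""
    return x + speed * scaled_dt
```

Because every update reads the scaled delta, toggling bullet-time slows monsters and projectiles uniformly while UI animations that use real time can keep running at full speed.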

Backend (AI Pipeline)

  • FastAPI for AI orchestration
  • Gemini 2.0 Flash Lite for vision analysis:
    • Real-time object recognition + game logic decisions
    • JSON-structured outputs for consistent gameplay
  • Featherless AI (DeepSeek-V3) for flavor text generation
  • Gemini 2.0 Flash Image for procedural item artwork generation

Architecture

  • Three-stage AI pipeline:
    1. Vision (Gemini): Analyzes object properties and context
    2. Logic (Gemini): Determines stats, item type, and attributes
    3. Narrative (Featherless): Creates immersive fantasy descriptions
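
The flow of the stages above could be sketched roughly as follows. This is an illustration under assumptions, not our production code: the model calls are replaced by stubs that return canned data, and ForgedItem, analyze_and_decide, write_flavor_text, and forge are hypothetical names. The vision and logic stages are shown as a single step because both run on Gemini.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ForgedItem:
    name: str
    item_type: str
    attack: int
    flavor_text: str = ""


async def analyze_and_decide(image_bytes: bytes) -> ForgedItem:
    """Stages 1-2 (stub): vision analysis plus stat and type decisions."""
    await asyncio.sleep(0)  # stands in for a network call to the vision model
    return ForgedItem(name="Stapler", item_type="weapon", attack=7)


async def write_flavor_text(item: ForgedItem) -> str:
    """Stage 3 (stub): narrative description for the forged item."""
    await asyncio.sleep(0)  # stands in for a network call to the text model
    return f"Behold the {item.name}, forged for battle!"


async def forge(image_bytes: bytes) -> ForgedItem:
    """Run the stages in order and return the finished item."""
    item = await analyze_and_decide(image_bytes)
    item.flavor_text = await write_flavor_text(item)
    return item
```

In a FastAPI backend, a function like forge would typically be awaited inside an async route handler; here it can be run directly with asyncio.run(forge(b"...")).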

Challenges we ran into

  1. AI Latency: Image generation takes 2-5 seconds
    • Solutions: Asynchronous loading with placeholder animations; stats shown immediately while images generate in the background; bullet-time that makes waits feel intentional rather than frustrating
  2. Prompt Consistency: Getting reliable object classification across diverse items required extensive iteration
    • Solutions: Structured JSON outputs with strict schemas, carefully crafted system prompts with examples
  3. Mobile Performance: Balancing visual quality with responsiveness on mobile devices
    • Solutions: WebP compression, strategic asset preloading, optimized state updates
  4. Context-Aware Skills: Teaching AI that coffee = buff, water = healing, sharp objects = damage
    • Solutions: Rule-based fallback logic combined with AI interpretation
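
Two of the solutions above, strict schemas and rule-based fallback, can be combined in a small sketch. Everything here is assumed for illustration: the "skill" field, the allowed values, the keyword table, the default, and the order in which the AI answer and the rules are consulted are hypothetical and only mirror the coffee, water, and sharp-object examples in the text.

```python
import json
from typing import Optional

ALLOWED_SKILLS = {"buff", "healing", "damage"}

# Hypothetical keyword rules, based on the examples in the text.
FALLBACK_RULES = {
    "coffee": "buff",
    "water": "healing",
    "scissors": "damage",
    "knife": "damage",
}


def parse_skill(raw_response: str) -> Optional[str]:
    """Return the skill from a model response, or None if it breaks the schema."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    skill = data.get("skill")
    if isinstance(skill, str) and skill in ALLOWED_SKILLS:
        return skill
    return None


def classify_skill(object_label: str, raw_response: str) -> str:
    """Prefer a valid AI answer; otherwise fall back to keyword rules."""
    skill = parse_skill(raw_response)
    if skill is not None:
        return skill
    label = object_label.lower()
    for keyword, fallback in FALLBACK_RULES.items():
        if keyword in label:
            return fallback
    return "damage"  # assumed default when nothing matches
```

With this shape, a malformed or off-schema model response never reaches the game logic; the worst case is a plausible rule-based guess.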

Accomplishments that we're proud of

  • Seamless AI Integration: Three different AI models (Gemini 2.0 Flash Lite, Gemini 2.0 Flash Image, and DeepSeek-V3 via Featherless AI) working together to create one cohesive experience
  • "Wow Moments": Watching everyday objects transform—staplers become "The Iron Crusher of Bureaucracy" with custom-generated artwork
  • Technical Innovation: Implementing bullet-time mechanics in a web-based game with smooth performance
  • Entertainment Value: The blacksmith's dramatic misinterpretations consistently delight players
  • True Mobile-First: Full PWA with offline support, installable app, and fullscreen gameplay

What we learned

  • Multimodal AI reduces latency: Combining vision + reasoning in one API call significantly outperforms separate service calls
  • Prompt engineering is critical: Small wording changes in prompts create massive differences in output quality and consistency
  • Perceived latency matters more than actual latency: Strategic time manipulation makes 3-second waits feel purposeful and cinematic
  • PWA capabilities have matured: Modern web APIs now rival native apps for gaming experiences
  • Structured outputs are essential: JSON mode provides necessary guardrails for reliable game logic in production

What's next for Forge the World

  • Multiplayer co-op: Team up to defend against harder waves
  • Persistent progression: Level systems and unlockable blacksmith dialogue
  • AR mode: Place forged items in your physical space
  • Community marketplace: Share your most creative item discoveries

Built With

  • deepseek-v3
  • docker
  • fastapi
  • featherless-ai
  • framer-motion
  • google-gemini-api
  • next.js
  • pillow
  • pwa
  • python
  • react
  • react-webcam
  • tailwind-css
  • typescript
  • uvicorn
  • webp
  • zustand