SketchMage uses the Gemini 3 family not as a simple text generator but as a physics and spatial validation engine. The application relies on the native multimodal capabilities and low latency of the Flash models to turn static drawings into active game mechanics.
The architecture:
Multimodal Vision for Motor Validation: Instead of simple image classification, we send the video stream to the model to analyze stroke topology. Gemini evaluates line firmness, connectivity between points (A and B), and user intent, acting as a real-time tutor.
Structured Output: We define a strict schema (responseSchema) that constrains every reply to a fixed, machine-readable shape, so the model behaves like a logic controller rather than a free-text generator. Gemini returns Cartesian coordinates [x, y] and Boolean success flags, which our Flutter rendering engine interprets to animate objects on the paper with pixel-level precision.
Interactive Latency: We leverage inference speed to create an immediate feedback loop ("The Living Path"), crucial to maintaining the child's immersion without breaking the learning experience.
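To make the Structured Output step more concrete, here is a minimal sketch in Python of what such a schema and the client-side check on the reply might look like. It is an illustration under stated assumptions, not SketchMage's actual code: the field names success and path, the StrokeVerdict type, and the parse_verdict helper are all hypothetical, the exact type names in the schema may differ by SDK, and the real app does this work in Flutter.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical schema in the OpenAPI-style shape that responseSchema expects.
# The field names "success" and "path" are assumptions for illustration.
RESPONSE_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "success": {"type": "BOOLEAN"},
        "path": {
            "type": "ARRAY",
            "items": {"type": "ARRAY", "items": {"type": "NUMBER"}},
        },
    },
    "required": ["success", "path"],
}


@dataclass(frozen=True)
class StrokeVerdict:
    """Validated reply: a success flag plus a list of (x, y) points."""
    success: bool
    path: List[Tuple[float, float]]


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_verdict(raw: str) -> StrokeVerdict:
    """Parse a JSON reply and re-check its shape before rendering."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    success = data.get("success")
    if not isinstance(success, bool):
        raise ValueError("'success' must be a boolean")

    points = data.get("path")
    if not isinstance(points, list):
        raise ValueError("'path' must be a list of [x, y] pairs")

    path = []
    for point in points:
        if (not isinstance(point, list) or len(point) != 2
                or not all(_is_number(v) for v in point)):
            raise ValueError(f"bad point: {point!r}")
        path.append((float(point[0]), float(point[1])))

    return StrokeVerdict(success=success, path=path)


if __name__ == "__main__":
    reply = '{"success": true, "path": [[12, 40.5], [96, 40.5]]}'
    verdict = parse_verdict(reply)
    print(verdict.success, verdict.path)
```

Re-validating the reply on the client, even with a schema in place, keeps a malformed or truncated response from reaching the renderer; the animation layer only ever sees a typed flag and a list of coordinate pairs.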