Inspiration

The spark for STEM Labs wasn't just a "good idea"—it was a necessity born from a real-world bottleneck.

We started by building bespoke 3D virtual labs to help science students in Africa who lacked access to traditional physical laboratories. Our mission was to democratize science education. However, we hit a wall: scalability. Teachers loved our labs but immediately asked, "Can I create my own lab for this specific topic?"

At the time, the answer was "No." Building a single custom interactive lab took us 6 weeks—from concept design and 3D modeling to interaction programming. We were the bottleneck.

We realized that to truly solve this problem, we didn't need more developers; we needed a way to turn the world's largest library of science content—YouTube—into active simulations instantly. That is how STEM Labs was born.

What it does

STEM Labs is an AI-powered engine that transforms passive video tutorials into active, interactive web simulations in real time.

A user simply pastes a URL to a science video. Our system analyzes the video's visual and audio context using Google Gemini and instantly generates a playable "Micro-App" alongside the video. Students can then manipulate variables (like gravity, velocity, or chemical volume) to verify what they just watched, turning a passive viewing experience into an active scientific experiment.

How we built it

We built STEM Labs using a "Multi-Agent" AI Architecture powered by Google Gemini. When a video URL is submitted, we first fetch the video context.

The Architect: A specialized prompt analyzes the video frame by frame to extract the core scientific concept, then outputs a strict JSON Design Spec.
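As a sketch of what the Architect hands off, a Design Spec might look like this (all field names here are illustrative, not our exact schema):

```json
{
  "concept": "Projectile motion",
  "variables": [
    { "name": "gravity", "unit": "m/s^2", "min": 1, "max": 20, "default": 9.81 },
    { "name": "launchVelocity", "unit": "m/s", "min": 0, "max": 50, "default": 15 }
  ],
  "visuals": { "objects": ["cannon", "ball", "trajectory"] },
  "interactions": ["slider:gravity", "slider:launchVelocity", "button:launch"]
}
```

Keeping the spec strict and machine-readable is what lets the next agent consume it without any human in the loop.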

The Engineer: This agent takes the JSON spec and writes the actual executable code. We instructed it to generate a Single-File HTML5 Application using modern web standards (Canvas API for physics, Tailwind for UI) so it can be rendered instantly without a build step.
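For illustration, the skeleton of such a single-file app might look like this (a sketch of the structure, not our exact generated output):

```html
<!-- Minimal shape of a generated single-file Micro-App (illustrative). -->
<!DOCTYPE html>
<html>
<head>
  <!-- Tailwind via CDN: styling with no build step -->
  <script src="https://cdn.tailwindcss.com"></script>
</head>
<body class="bg-slate-900">
  <canvas id="sim" width="640" height="360"></canvas>
  <script>
    // All simulation logic is inlined here, so the file renders
    // immediately inside a sandboxed iframe with no bundler.
    const ctx = document.getElementById("sim").getContext("2d");
    // ...requestAnimationFrame loop drawing the physics state...
  </script>
</body>
</html>
```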

The Runtime: We built the frontend with React and Vite for a snappy, responsive experience. The generated simulation runs inside a secure sandboxed iframe, communicating with the main app via the postMessage API.
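As a sketch of the parent-side wiring (the message shape below is an assumption for illustration, not our exact protocol):

```javascript
// Pure validator: accept only well-formed messages from the simulation.
// Keeping this pure makes it easy to unit-test outside a browser.
function isSimMessage(data) {
  return (
    data !== null &&
    typeof data === "object" &&
    data.source === "stem-lab" &&          // tag set by the generated app (illustrative)
    typeof data.type === "string" &&
    ["ready", "state", "error"].includes(data.type)
  );
}

// Browser-only wiring: listen for messages from the sandboxed iframe.
if (typeof window !== "undefined") {
  window.addEventListener("message", (event) => {
    if (!isSimMessage(event.data)) return;  // ignore unrelated messages
    if (event.data.type === "state") {
      // e.g. update React state with the simulation's reported variables
      console.log("sim state:", event.data.payload);
    }
  });
}
```

Filtering on a known message shape (and, in production, on `event.origin`) keeps the sandboxed code from injecting arbitrary events into the host app.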

Challenges we ran into

First was Gemini's API rate limit, which kept interrupting our testing and slowed the building of the app.

The "Hallucination" Trap: Early versions of the AI would sometimes generate a generic "Music App" or "Todo List" regardless of the video input, because it was over-fitting to our few-shot examples. We solved this by implementing a "Lens" prompt strategy that forces the AI to extract specific visual cues from the video before writing a single line of code.

Latency vs. Accuracy: Generating a full physics engine takes time. We optimized this by splitting the task: Gemini Flash handles the rapid code generation, while the heavier Gemini Pro handles the complex reasoning for the design spec.

Physics is Hard: Getting an LLM to understand what needs to be recalculated on every frame of a requestAnimationFrame loop was tricky. We had to refine our system prompts to act as a "Senior Gameplay Engineer," enforcing specific coding patterns for simulation loops.
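The fixed-timestep simulation-loop pattern our prompts push the model toward can be sketched like this (all names and numbers are illustrative, not our exact generated code):

```javascript
const GRAVITY = 9.81; // m/s^2, downward

// Pure physics step: semi-implicit Euler for a projectile.
// Kept free of rendering so the math can be tested anywhere.
function stepPhysics(state, dt) {
  const vy = state.vy - GRAVITY * dt;   // update velocity first
  return {
    x: state.x + state.vx * dt,
    y: state.y + vy * dt,               // then position from the new velocity
    vx: state.vx,
    vy,
  };
}

// Fixed-timestep accumulator loop: physics advances in constant steps,
// decoupled from render frames, so results stay stable when
// requestAnimationFrame jitters.
function makeLoop(render) {
  const DT = 1 / 120; // fixed physics step in seconds
  let state = { x: 0, y: 0, vx: 10, vy: 10 };
  let accumulator = 0;
  let last = performance.now();

  function frame(now) {
    accumulator += (now - last) / 1000;
    last = now;
    while (accumulator >= DT) {
      state = stepPhysics(state, DT);
      accumulator -= DT;
    }
    render(state);
    requestAnimationFrame(frame);
  }
  return frame;
}

// Only start the loop in a browser; stepPhysics stays usable elsewhere.
if (typeof requestAnimationFrame !== "undefined") {
  requestAnimationFrame(makeLoop((s) => { /* draw s on the canvas */ }));
}
```

Separating the pure `stepPhysics` function from the frame callback is exactly the kind of pattern a "Senior Gameplay Engineer" persona can be told to enforce.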

Accomplishments that we're proud of

From "6 Weeks" to "60 Seconds": Our biggest pride is the technical validation of our core thesis. We successfully reduced the development time of an interactive educational simulation from a manual 6-week process to a nearly instant AI-generated workflow.

Taming the LLM Hallucinations: We are proud of our "Lens" prompting strategy. Getting a text-based AI to "watch" a video and strictly adhere to its visual physics—without drifting into generic templates—was a massive hurdle. We successfully engineered a pipeline where the AI acts as a strict "Reality Mirror" rather than a creative writer.

The "Architect-Engineer" Workflow: We successfully implemented a multi-agent system where Gemini 1.5 Pro handles the complex pedagogical reasoning (The Architect) and Gemini 1.5 Flash handles the rapid code execution (The Engineer). Getting these two models to "talk" to each other via a JSON Spec was a significant engineering win.
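The handoff can be sketched as a simple two-stage pipeline (all names here are ours for illustration; each model is abstracted as an async text-in/text-out function, so the flow can be exercised with stubs):

```javascript
// Architect → Engineer pipeline (illustrative sketch, not our exact code).
async function generateLab(videoContext, architectModel, engineerModel) {
  // 1. The Architect reasons about the video and emits a strict JSON spec.
  const specText = await architectModel(
    `Extract the core scientific concept and output a JSON design spec:\n${videoContext}`
  );
  const spec = JSON.parse(specText); // fail fast if the spec isn't valid JSON

  // 2. The Engineer turns that spec into a single-file HTML5 app.
  const html = await engineerModel(
    `Write a single-file HTML5 simulation implementing this spec:\n${JSON.stringify(spec)}`
  );
  return { spec, html };
}
```

Forcing the intermediate artifact to parse as JSON is what makes the two models composable: the Engineer never sees free-form prose, only a validated spec.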

Democratizing Development: We built a tool that turns consumers (teachers/students) into creators. We are proud that a biology teacher in Lagos with zero coding knowledge can now build a custom mitosis simulator just by pasting a YouTube link.

What we learned

We learned that Multimodal AI is the key to unlocking the "Last Mile" of education. Video content is abundant, but practice is scarce. By bridging this gap, we aren't just saving 6 weeks of development time; we are giving every teacher the power to become a software engineer and every student the keys to their own private laboratory.

What's next for STEM Labs

The "Classroom Mode" (LMS Integration): Currently, STEM Labs is a solo experience. Our next step is to add Teacher Dashboards where educators can curate playlists of generated labs, assign them to students, and track their quiz performance and "Lab Reports" in real-time.

Offline-First Architecture: Since our target demographic includes students in regions with unstable internet (like parts of Africa), we plan to implement a Progressive Web App (PWA) strategy. This would allow generated labs to be cached and played offline once downloaded.

Community "Lab Store": We want to build a dynamic, user-generated library (replacing our current static examples.json). If a user generates a particularly amazing simulation for "Quantum Entanglement," they should be able to publish it to the global "Lab Store" for others to use.

Multimodal "Textbook-to-Lab": We plan to expand the Video2App engine to support image inputs. Imagine snapping a photo of a diagram in a physical physics textbook, and having STEM Labs instantly bring that static diagram to life as an interactive simulation.

Built With

Google Gemini · React · Vite · Tailwind CSS · HTML5 Canvas
