Inspiration
We live in a world of static software. If you want to change a dashboard, analyze a new video format, or query a database, you usually have to call an engineer. We wanted to build a "Self-Constructing Interface": software that adapts to the user, not the other way around. Inspired by the new multimodal capabilities of Gemini 3.0, we asked: can we build an app that rewrites its own code based on what it sees and hears?
What it does
Gemini Shape-Shifter is a multimodal AI agent that autonomously re-codes its interface to handle any file you drop:
Data Shape-Shifter: Drop a CSV? It writes a Python dashboard with Streamlit and Seaborn to surface hidden insights instantly.
Vision Engine: Drop a hand-drawn UI sketch? It recognizes the wireframe and writes the frontend code to build it.
Video Intelligence: Drop a video file? Using Gemini's long context window, it watches the clip frame by frame and generates detailed summaries and analysis.
Voice-to-SQL Agent: Speak a question? It translates natural language into executable SQL queries, answering complex database questions without typing.
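To make the Voice-to-SQL idea concrete, here is a minimal sketch of the prompt-assembly step, assuming the audio has already been transcribed by Gemini's audio modality. The helper name `build_sql_prompt` and the prompt wording are illustrative, not the app's actual prompt:

```python
def build_sql_prompt(question: str, schema: str) -> str:
    """Assemble a natural-language-to-SQL prompt for the model.

    `question` is the transcribed user utterance; `schema` is a
    CREATE TABLE dump of the target database. (Hypothetical helper;
    the app's real prompt wording may differ.)
    """
    return (
        "You are a SQL assistant. Given this schema:\n"
        f"{schema}\n"
        "Answer the question with a single executable SQL query, "
        "and nothing else.\n"
        f"Question: {question}"
    )
```

The returned string would then be sent to the model, e.g. via `model.generate_content(...)` in the google-generativeai SDK, and the reply executed against the database.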
How we built it
We built the core engine in Python 3.12, with Streamlit for the frontend.
The Brain: We used google-generativeai to connect to Gemini 3.0 Flash Preview.
The Router: The app uses a "Router Pattern" to detect file MIME types (CSV, PNG, MP4, WAV) and route them to the specific AI modality.
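The Router Pattern described above can be sketched in a few lines with the standard-library `mimetypes` module. The handler names are illustrative; the real app wires each route to its own Streamlit view:

```python
import mimetypes

def route(filename: str) -> str:
    """Map a dropped file to the modality that should handle it."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        return "unsupported"
    if mime == "text/csv":
        return "data"        # CSV -> dashboard generator
    if mime.startswith("image/"):
        return "vision"      # PNG sketch -> frontend code
    if mime.startswith("video/"):
        return "video"       # MP4 -> long-context video analysis
    if mime.startswith("audio/"):
        return "voice"       # WAV -> voice-to-SQL agent
    return "unsupported"
```

Routing on MIME type rather than file extension keeps the dispatch table small and lets new modalities be added with one extra branch.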
The logic: For data visualization, we used an exec() loop that allows the AI to write and execute its own visualization code in real-time.
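A stripped-down sketch of that exec() loop, assuming the generated snippet reads a `data` variable and assigns its output to `result`. The real app hands the model a pandas DataFrame and lets it call Streamlit/Seaborn; this version sticks to the standard library so it stays self-contained:

```python
def run_generated_code(code: str, data):
    """Execute model-generated analysis code in an isolated namespace.

    The snippet sees only `data` and is expected to assign its
    output to `result`. (Simplified: the production loop also
    catches exceptions and feeds tracebacks back to the model.)
    """
    namespace = {"data": data}
    exec(code, namespace)  # code comes from our own controlled prompt
    return namespace.get("result")
```

For example, if the model returns `"result = sum(row['sales'] for row in data)"`, running it against a list of row dicts yields the aggregate the dashboard then renders.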
Deployment: We set up a full CI/CD pipeline connecting GitHub to Streamlit Cloud for instant deployment updates.
Challenges we ran into
The biggest challenge was the "Headless Linux" trap. We initially tried to generate video-analysis previews with OpenCV (cv2) on the server, but the cloud environment lacked the required video codecs (h264, ffmpeg), so the app crashed or produced 0-byte files. The fix: we pivoted to a "Proxy Pattern." Instead of forcing the lightweight server to render heavy video, we built a retrieval system using requests to fetch standardized media for analysis, which kept the app stable.
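The Proxy Pattern boils down to fetching pre-rendered media over HTTP instead of encoding it server-side, plus a guard against the 0-byte failure mode that motivated the pivot. A minimal sketch (the team used the requests library; stdlib urllib is shown here to stay dependency-free, and the URL would be whatever media endpoint the app points at):

```python
import urllib.request

def fetch_media(url: str, timeout: float = 30.0) -> bytes:
    """Fetch media over HTTP rather than rendering it with cv2
    on a codec-less server."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return validate_media(resp.read())

def validate_media(data: bytes) -> bytes:
    """Reject the 0-byte payloads the broken codec path used to emit."""
    if not data:
        raise ValueError("media fetch returned 0 bytes")
    return data
```

Validating the payload size at the boundary turns a silent 0-byte artifact into an immediate, debuggable error.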
Accomplishments that we're proud of
True Multimodality: Successfully integrating text, vision, audio, and video into a single, unified interface.
The "Nuclear" Fix: Debugging a complex cloud deployment crash in the final hours and implementing a robust workaround.
CI/CD Pipeline: Setting up a professional DevOps workflow where every GitHub push automatically updates the live production app.
What we learned
The Power of Long Context: We learned that Gemini 3.0 can "watch" video with impressive accuracy, going well beyond traditional frame-sampling approaches.
Deployment resilience: We learned that code that works on a Mac M4 doesn't always work on a Linux Cloud server, and how to handle dependency management (specifically opencv-python-headless) to fix it.
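The dependency fix mentioned above amounts to one swap in requirements.txt, replacing the desktop OpenCV build with the headless one (excerpt shown as a sketch):

```
# requirements.txt (excerpt)
# opencv-python           <- expects GUI/codec libraries missing on Streamlit Cloud
opencv-python-headless    # server-safe build with no GUI dependencies
```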
What's next for Gemini Shape-Shifter
Real-Time Database Connection: connecting the SQL Agent to live PostgreSQL databases.
Two-Way Voice: Adding text-to-speech so the Agent can talk back to us.
Self-Correction: Giving the AI the ability to "test" the code it writes and fix its own bugs before showing the result to the user.