Inspiration
Manual bug reporting is a silent productivity killer in software development. As a student, I noticed that developers spend too much time documenting bugs and not enough time fixing them. I wanted to bridge the gap between a simple screen recording and a professional technical report.
What it does
Gemini QA Autopilot uses the multimodal capabilities of Gemini 3 Flash to turn a raw screen recording into artifacts a developer can act on. It "watches" the recording, identifies the exact moment of failure, and automatically generates:
- A professional, detailed bug report.
- An automated Playwright test script to replicate the issue instantly.
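To make the second output concrete, here is a minimal sketch of how a list of reproduction steps could be rendered into a Playwright test file. The step schema (`action`, `url`, `selector`, `value`) and the function name `renderPlaywrightTest` are hypothetical and chosen only for illustration; the project's actual intermediate format is not described here.

```javascript
// Hypothetical step schema, for illustration only:
//   { action: "goto", url }
//   { action: "click", selector }
//   { action: "fill", selector, value }
//   { action: "expectVisible", selector }
function renderPlaywrightTest(title, steps) {
  const lines = steps.map((step) => {
    switch (step.action) {
      case "goto":
        return `  await page.goto(${JSON.stringify(step.url)});`;
      case "click":
        return `  await page.click(${JSON.stringify(step.selector)});`;
      case "fill":
        return `  await page.fill(${JSON.stringify(step.selector)}, ${JSON.stringify(step.value)});`;
      case "expectVisible":
        return `  await expect(page.locator(${JSON.stringify(step.selector)})).toBeVisible();`;
      default:
        // Fail loudly rather than emit a script that silently skips a step.
        throw new Error(`Unsupported action: ${step.action}`);
    }
  });

  // JSON.stringify is used so that quotes in titles and selectors are escaped.
  return [
    `import { test, expect } from "@playwright/test";`,
    ``,
    `test(${JSON.stringify(title)}, async ({ page }) => {`,
    ...lines,
    `});`,
    ``,
  ].join("\n");
}
```

Rendering from structured steps, rather than asking the model for a finished script, keeps the generated file syntactically valid even when the model's wording varies.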
How I built it
The core of the system is a Node.js and Express backend. I used the Google Generative AI SDK to send the recording's video frames to the Gemini 3 Flash Preview model for analysis. The frontend is styled with Tailwind CSS for a clean, professional dashboard that focuses on results.
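As a rough sketch of the backend's model call, the snippet below builds a multimodal request from a recording and a text prompt. It assumes the recording is small enough to send inline as base64; larger files would likely need a separate upload path. The model id string, the function names, and the idea of passing the `model` object in as a parameter are assumptions for illustration, not details from the project.

```javascript
// Assumed model id, based on the "Gemini 3 Flash Preview" name in the text.
const MODEL_ID = "gemini-3-flash-preview";

// Builds the parts array: the video as inline base64 data, then the prompt.
function buildVideoRequest(videoBuffer, mimeType, prompt) {
  return [
    { inlineData: { data: videoBuffer.toString("base64"), mimeType } },
    { text: prompt },
  ];
}

// `model` is expected to behave like the object returned by
// new GoogleGenerativeAI(apiKey).getGenerativeModel({ model: MODEL_ID })
// from the @google/generative-ai package. It is injected so the
// function can be exercised with a stub.
async function analyzeRecording(model, videoBuffer, mimeType, prompt) {
  const parts = buildVideoRequest(videoBuffer, mimeType, prompt);
  const result = await model.generateContent(parts);
  return result.response.text();
}
```

In an Express route, `videoBuffer` would come from the uploaded file and the returned text would be parsed into the bug report and test steps.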
Challenges I ran into
The biggest challenge was optimizing the visual reasoning. Getting the AI to distinguish between a normal UI transition and an actual application crash required precise prompt engineering and careful handling of video frame sequences.
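One way to approach this kind of prompt engineering is to force the model into a small set of verdicts and validate its reply before trusting it. The sketch below is illustrative only: the verdict labels, the JSON shape, and the function names are assumptions, not the prompt actually used in the project.

```javascript
// Hypothetical verdict labels, for illustration only.
const VERDICTS = ["normal_transition", "crash", "visual_glitch"];

function buildTriagePrompt() {
  return [
    "You are reviewing a screen recording of an application.",
    "Decide whether it shows a real failure or only normal UI behavior.",
    `Reply with JSON only: {"verdict": one of ${JSON.stringify(VERDICTS)}, "timestampSec": number, "evidence": string}.`,
    "Treat animations, page loads, and modal dismissals as normal_transition.",
    "Treat an unexpected app exit, a frozen screen, or an error dialog as crash.",
  ].join("\n");
}

// Parses the model's reply and rejects anything outside the expected shape,
// so a vague or malformed answer never turns into a bug report.
function parseTriageReply(replyText) {
  const parsed = JSON.parse(replyText);
  if (!VERDICTS.includes(parsed.verdict)) {
    throw new Error(`Unexpected verdict: ${parsed.verdict}`);
  }
  if (typeof parsed.timestampSec !== "number" || parsed.timestampSec < 0) {
    throw new Error("timestampSec must be a non-negative number");
  }
  return {
    verdict: parsed.verdict,
    timestampSec: parsed.timestampSec,
    evidence: String(parsed.evidence ?? ""),
  };
}
```

Spelling out what counts as a normal transition gives the model an explicit negative class, which is the distinction this challenge is about.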
Accomplishments that I'm proud of
I am proud of creating a tool that can "understand" an iOS crash just by looking at a screen recording. Achieving a seamless flow from video upload to a functional Playwright script was a major milestone for this project.
What I learned
I learned a lot about multimodal AI capabilities and how to handle large file uploads in a Node.js environment while maintaining a responsive user experience.
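A common pattern for large uploads in Node.js is to stream the request body to disk instead of buffering it in memory, and to enforce a size cap along the way. The sketch below assumes the video arrives as a raw request body (multipart form uploads would need a parser in front of it); the limit value and the function names are hypothetical.

```javascript
const fs = require("node:fs");
const { Transform } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Hypothetical cap; the right value depends on the deployment.
const MAX_UPLOAD_BYTES = 200 * 1024 * 1024;

// Pass-through stream that errors once more than maxBytes have flowed through.
function createSizeLimiter(maxBytes) {
  let seenBytes = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      seenBytes += chunk.length;
      if (seenBytes > maxBytes) {
        callback(new Error(`Upload exceeds ${maxBytes} bytes`));
        return;
      }
      callback(null, chunk);
    },
  });
}

// `req` is any readable stream, such as an Express request with a raw body.
// pipeline() applies backpressure, so memory use stays bounded and the
// event loop remains free to serve other requests during the upload.
async function saveUpload(req, destinationPath, maxBytes = MAX_UPLOAD_BYTES) {
  await pipeline(req, createSizeLimiter(maxBytes), fs.createWriteStream(destinationPath));
  return destinationPath;
}
```

If the limit is hit, `saveUpload` rejects and a partial file may remain on disk, so a real handler would also delete it.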
What's next for Gemini QA Autopilot
I plan to integrate the tool directly with Jira and GitHub Issues, and add support for more automation frameworks like Cypress and Selenium.