Parallax - Autopilot for UX Testing
Dashboard with the live testing tab, showing the UX persona agents and the task each will perform on the given website
AI agents in action! Each agent navigates the website in character as its defined persona while testing the UX of the application
Journey replay of the UX testing with screenshots and thoughts and actions of each AI persona
UX report for the test run with helpful metrics
Inspiration
The internet is a diverse place, yet we test websites using a monolithic "ideal user" profile. 97% of web homepages have detectable accessibility failures, but traditional QA tools only catch what's in the DOM. We were inspired by the idea that a 72-year-old grandmother and a 20-year-old software engineer "see" the same website differently. We wanted to build a tool that doesn't just scan code, but actually experiences the web through diverse eyes using state-of-the-art Multimodal Vision.
What it does
Parallax is a Next-Gen UI Navigator Agent that uses Gemini 2.5 Flash to test websites through the eyes of diverse user personas simultaneously. Users can launch a single test run where "Martha" (72, low tech literacy), "Raj" (28, power user), and many others navigate the target URL autonomously.
- Vision-First: It doesn't scrape the DOM. It captures screenshots and "sees" the UI.
- Autonomous Navigation: It clicks, types, and scrolls based on visual hierarchy.
- Real-Time Dashboard: You watch the agents "think" and act in a live stream.
- Actionable Reports: It aggregates the cognitive friction detected by all personas into a prioritized UX report.
This helps teams catch real UX and accessibility issues like hidden CTAs, confusing navigation, or high‑friction flows without recruiting human testers for every release.
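As a rough illustration of how per-persona friction could be rolled up into a prioritized report, here is a minimal sketch. The `Finding` class, the 1 to 5 severity scale, and the scoring rule (mean severity times the number of distinct personas affected) are assumptions made for this example, not a description of Parallax's actual report logic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    persona: str   # e.g. "Martha"
    issue: str     # short label for the friction point
    severity: int  # 1 (minor) to 5 (blocking); a hypothetical scale


def prioritize(findings: list[Finding]) -> list[tuple[str, float, list[str]]]:
    """Group findings by issue and rank them.

    Score = mean severity * number of distinct personas affected, so an
    issue that trips up several personas outranks a one-off complaint.
    """
    by_issue: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        by_issue[finding.issue].append(finding)

    ranked = []
    for issue, group in by_issue.items():
        personas = sorted({f.persona for f in group})
        mean_severity = sum(f.severity for f in group) / len(group)
        ranked.append((issue, mean_severity * len(personas), personas))
    # Highest score first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


findings = [
    Finding("Martha", "hidden checkout CTA", 5),
    Finding("Raj", "hidden checkout CTA", 3),
    Finding("Martha", "small body text", 4),
]
report = prioritize(findings)
for issue, score, personas in report:
    print(f"{score:>4.1f}  {issue}  (seen by {', '.join(personas)})")
```

Weighting by the number of affected personas is one simple way to surface problems that cut across user types; a real report would likely also factor in where in the flow the issue occurred.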
How we built it
We architected Parallax as a distributed system designed for scale:
- Orchestration: Built with Google ADK (Agent Development Kit) to manage the multi-agent pipeline.
- Core Logic: A FastAPI backend executes a vision-to-action loop using Gemini 2.5 Flash.
- Browser Engine: Playwright handles the headless interaction and high-fidelity screenshot capture.
- Frontend: A premium, dark-mode React dashboard (Vite) provides a real-time view via WebSockets.
- Cloud Infrastructure: Fully containerized and deployed on Google Cloud Run, with Cloud Firestore storing report data and Google Cloud Storage persisting multimodal artifacts.
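The vision-to-action loop at the core of this architecture can be sketched roughly as follows. To keep the example self-contained, the browser and the model are passed in as plain async callables: in a real deployment `capture` and `execute` would wrap a Playwright page and `decide` would wrap a Gemini call with the screenshot attached. All names here (`Action`, `Journey`, `run_persona`) are hypothetical, and the stubs in `demo` stand in for the real services.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Action:
    kind: str          # "click", "type", "scroll", or "done"
    thought: str = ""  # the persona's reasoning, e.g. for a live dashboard
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class Journey:
    persona: str
    steps: list[Action] = field(default_factory=list)


# Hypothetical signatures for the injected browser and model wrappers.
Capture = Callable[[], Awaitable[bytes]]
Decide = Callable[[str, str, bytes, list[Action]], Awaitable[Action]]
Execute = Callable[[Action], Awaitable[None]]


async def run_persona(persona: str, task: str, capture: Capture,
                      decide: Decide, execute: Execute,
                      max_steps: int = 20) -> Journey:
    """Screenshot -> model decision -> browser action, until done."""
    journey = Journey(persona)
    for _ in range(max_steps):
        screenshot = await capture()
        action = await decide(persona, task, screenshot, journey.steps)
        journey.steps.append(action)
        if action.kind == "done":
            break
        await execute(action)
    return journey


async def demo() -> list[Journey]:
    async def capture() -> bytes:
        return b"fake-png-bytes"  # stand-in for a real screenshot

    async def decide(persona, task, screenshot, history):
        # Stub policy: click once, then finish.
        if not history:
            return Action("click", f"{persona} sees a Sign up button", 640, 360)
        return Action("done", "Task complete")

    async def execute(action: Action) -> None:
        await asyncio.sleep(0)  # stand-in for a real browser action

    personas = ["Martha", "Raj"]
    # Each persona runs its own loop; gather drives them concurrently.
    return await asyncio.gather(
        *(run_persona(p, "Sign up", capture, decide, execute) for p in personas)
    )


journeys = asyncio.run(demo())
```

The `max_steps` cap is a design choice worth keeping in any variant of this loop: a vision agent that misreads a page can otherwise cycle indefinitely.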
Challenges we ran into
The biggest technical hurdle was Docker optimization. A browser-based agent requires heavy dependencies (Chromium, system libs), which initially led to a 1.8GB image. We refactored the build process to use a slim Debian base and single-layer cleanup, reducing the size by nearly 50% for faster Cloud Run cold starts. Additionally, refining the coordinate precision for vision-based clicking required significant prompt engineering to ensure the agent didn't "miss" buttons.
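One part of the click-precision problem is mapping whatever coordinates the model returns onto real viewport pixels. The sketch below assumes the model is prompted to return a bounding box as `(ymin, xmin, ymax, xmax)` normalized to a 0 to 1000 scale; that format, the function name `box_to_click_point`, and the center-of-box click strategy are assumptions for illustration, not a confirmed detail of how Parallax was built.

```python
def box_to_click_point(box: tuple[int, int, int, int],
                       viewport_width: int, viewport_height: int,
                       scale: int = 1000) -> tuple[int, int]:
    """Map a normalized (ymin, xmin, ymax, xmax) box to a pixel click point."""
    ymin, xmin, ymax, xmax = box
    if not (0 <= xmin <= xmax <= scale and 0 <= ymin <= ymax <= scale):
        raise ValueError(f"box out of range: {box}")
    # Click the center of the box, scaled to the actual viewport size.
    cx = (xmin + xmax) / 2 / scale * viewport_width
    cy = (ymin + ymax) / 2 / scale * viewport_height
    # Clamp so a box touching the edge never yields an off-screen click.
    x = min(max(round(cx), 0), viewport_width - 1)
    y = min(max(round(cy), 0), viewport_height - 1)
    return x, y


point = box_to_click_point((400, 450, 500, 550), 1280, 720)
print(point)
```

Rejecting malformed boxes up front, rather than clicking a best guess, gives the loop a chance to re-prompt the model instead of silently missing the button.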
Accomplishments that we're proud of
We successfully built a truly multimodal agent that acts with zero knowledge of the underlying source code. Seeing the agent autonomously navigate a complex "Table of Contents" on Wikipedia or handle a login flow purely through visual cues was a massive win. We are also proud of the UX Dashboard, which makes complex AI orchestration feel intuitive and alive.
What we learned
We gained a deep understanding of Gemini 2.5’s low-latency multimodal capabilities. We learned that LLMs are surprisingly good at understanding spatial relationships in UI when given the right context. We also mastered the Google ADK for multi-agent workflows, which allowed us to keep different agents' experiences completely sandboxed yet synchronized.
What's next for Parallax
- Custom persona generation: Let users create their own personas based on their use case.
- CI/CD Integration: A "Parallax Check" that runs automatically on every Pull Request.
- Voice Integration: Use Gemini Live to let users "talk" to the agents as they navigate, for example to ask, "Martha, why did you stop there?"
Built With
- docker
- fastapi
- gemini
- google-artifact-registry
- google-cloud
- google-cloud-firestore
- google-cloud-run
- google-cloud-sdk
- google-gen-ai-sdk
- googleadk
- javascript
- playwright
- python
- react
- uvicorn
- vite
- websockets