Parallax

Parallax - Autopilot for UX Testing
Dashboard with the live testing tab, showing the UX persona agents and the task each will perform on the target website
AI agents in action: each agent navigates the website in character as its defined persona while testing the application's UX
Journey replay of a UX test, with screenshots plus the thoughts and actions of each AI persona
UX report for the test run, with helpful metrics

Inspiration

The internet is a diverse place, yet we test websites against a single "ideal user" profile. 97% of home pages have detectable accessibility failures, but traditional QA tools only catch what's in the DOM. We were inspired by the idea that a 72-year-old grandmother and a 20-year-old software engineer "see" the same website differently. We wanted to build a tool that doesn't just scan code but actually experiences the web through diverse eyes, using state-of-the-art multimodal vision.

What it does

Parallax is a Next-Gen UI Navigator Agent that uses Gemini 2.5 Flash to test websites through the eyes of diverse user personas simultaneously. Users can launch a single test run where "Martha" (72, low tech literacy), "Raj" (28, power user), and many others navigate the target URL autonomously.

Vision-First: It doesn't scrape the DOM. It captures screenshots and "sees" the UI.
Autonomous Navigation: It clicks, types, and scrolls based on visual hierarchy.
Real-Time Dashboard: You watch the agents "think" and act in a live stream.
Actionable Reports: It aggregates the cognitive friction detected by all personas into a prioritized UX report.

This helps teams catch real UX and accessibility issues, such as hidden CTAs, confusing navigation, and high-friction flows, without recruiting human testers for every release.
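
As a rough illustration of how friction from several personas could be rolled up into a prioritized report, here is a minimal Python sketch. The `Finding` shape, the 1 to 5 severity scale, and the ranking rule (personas affected first, then severity) are all assumptions made for this example, not Parallax's actual report schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    persona: str   # e.g. "Martha"
    issue: str     # short label for the friction point
    severity: int  # hypothetical scale: 1 (minor) to 5 (blocking)


def prioritize(findings: list[Finding]) -> list[tuple[str, int, int]]:
    """Return (issue, personas_affected, max_severity), most urgent first."""
    personas = defaultdict(set)
    worst = defaultdict(int)
    for f in findings:
        personas[f.issue].add(f.persona)
        worst[f.issue] = max(worst[f.issue], f.severity)
    rows = [(issue, len(personas[issue]), worst[issue]) for issue in personas]
    # Issues hit by more personas rank first; severity, then name, break ties.
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


findings = [
    Finding("Martha", "hidden CTA", 4),
    Finding("Raj", "hidden CTA", 2),
    Finding("Martha", "confusing navigation", 5),
]
report = prioritize(findings)
```

Ranking by breadth before severity is only one possible policy; a team that cares most about blocking issues might sort on severity first.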

How we built it

We architected Parallax as a distributed system designed for scale:

Orchestration: Built with Google ADK (Agent Development Kit) to manage the multi-agent pipeline.
Core Logic: A FastAPI backend executes a vision-to-action loop using Gemini 2.5 Flash.
Browser Engine: Playwright handles the headless interaction and high-fidelity screenshot capture.
Frontend: A premium, dark-mode React dashboard (Vite) provides a real-time view via WebSockets.
Cloud Infrastructure: Fully containerized and deployed on Google Cloud Run, with Cloud Firestore storing report data and Google Cloud Storage persisting multimodal artifacts.
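
The vision-to-action loop at the center of this architecture can be sketched roughly as follows. This is a simplified illustration rather than the project's code: `Action`, `run_persona`, and the three injected callables are hypothetical names. In a real run, `capture` would presumably wrap a Playwright screenshot, `decide` a Gemini call, and `act` a Playwright mouse or keyboard call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str          # "click", "type", "scroll", or "done"
    thought: str = ""  # the persona's stated reasoning for this step
    x: int = 0
    y: int = 0
    text: str = ""


def run_persona(
    persona: str,
    task: str,
    capture: Callable[[], bytes],
    decide: Callable[[str, str, bytes, list[Action]], Action],
    act: Callable[[Action], None],
    max_steps: int = 20,
) -> list[Action]:
    """Screenshot, ask the model for one action, perform it, and repeat."""
    history: list[Action] = []
    for _ in range(max_steps):
        screenshot = capture()  # only pixels are passed on, never the DOM
        action = decide(persona, task, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        act(action)
    return history


# Dry run with stubs: no browser or model is needed to exercise the loop.
scripted = iter([
    Action("click", "The big button looks like the way forward", 120, 340),
    Action("done", "I found the contact page"),
])
performed: list[Action] = []
journey = run_persona(
    "Martha",
    "Find the contact page",
    capture=lambda: b"",
    decide=lambda persona, task, shot, history: next(scripted),
    act=performed.append,
)
```

Injecting the three steps as callables keeps the loop testable on its own, and the returned history is the kind of record a journey replay could be built from.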

Challenges we ran into

The biggest technical hurdle was Docker optimization. A browser-based agent requires heavy dependencies (Chromium, system libraries), which initially produced a 1.8 GB image. We refactored the build to use a slim Debian base image and to clean up within a single layer, cutting the size by nearly 50% and speeding up Cloud Run cold starts. Refining the coordinate precision of vision-based clicking also required significant prompt engineering so that the agent didn't "miss" buttons.
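
One common source of missed clicks is the mapping from model coordinates to viewport pixels. The sketch below assumes the model is prompted to return boxes as `[ymin, xmin, ymax, xmax]` on a 0 to 1000 normalized grid; that convention, and the helper names `box_center` and `to_pixels`, are assumptions for illustration and may not match how Parallax handles coordinates.

```python
def box_center(ymin: float, xmin: float, ymax: float, xmax: float) -> tuple[float, float]:
    """Return the (x, y) center of a box given as ymin, xmin, ymax, xmax."""
    return (xmin + xmax) / 2, (ymin + ymax) / 2


def to_pixels(norm_x: float, norm_y: float, width: int, height: int,
              grid: int = 1000) -> tuple[int, int]:
    """Map a point on a grid-by-grid normalized space to viewport pixels."""
    # Clamp first so a slightly out-of-range prediction cannot land off-screen.
    nx = min(max(norm_x, 0), grid)
    ny = min(max(norm_y, 0), grid)
    px = min(round(nx / grid * width), width - 1)
    py = min(round(ny / grid * height), height - 1)
    return px, py


# A button predicted at [400, 450, 460, 550] on a 1280x720 viewport.
cx, cy = box_center(400, 450, 460, 550)
click_at = to_pixels(cx, cy, width=1280, height=720)
```

Clicking the center of the predicted box rather than a corner leaves some margin for small localization errors.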

Accomplishments that we're proud of

We successfully built a truly multimodal agent that acts with zero knowledge of the underlying source code. Seeing the agent autonomously navigate a complex "Table of Contents" on Wikipedia or handle a login flow purely through visual cues was a massive win. We are also proud of the UX Dashboard, which makes complex AI orchestration feel intuitive and alive.

What we learned

We gained a deep understanding of Gemini 2.5’s low-latency multimodal capabilities. We learned that LLMs are surprisingly good at understanding spatial relationships in UI when given the right context. We also mastered the Google ADK for multi-agent workflows, which allowed us to keep different agents' experiences completely sandboxed yet synchronized.
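
To make "sandboxed yet synchronized" concrete, here is a small sketch using plain `asyncio` rather than the Google ADK API, which is not shown here. Each persona keeps its own private state while all of them publish to one shared feed, much as a live dashboard might consume it. `PersonaRun`, `run_one`, and `run_all` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class PersonaRun:
    name: str
    events: list[str] = field(default_factory=list)  # private to this persona


async def run_one(run: PersonaRun, steps: int, feed: asyncio.Queue) -> PersonaRun:
    for step in range(1, steps + 1):
        await asyncio.sleep(0)  # stand-in for a screenshot plus model round trip
        event = f"{run.name}: step {step}"
        run.events.append(event)  # isolated history
        await feed.put(event)     # shared stream for observers
    return run


async def run_all(names: list[str], steps: int = 2):
    feed: asyncio.Queue = asyncio.Queue()
    # gather runs the personas concurrently and returns results in input order.
    runs = await asyncio.gather(
        *(run_one(PersonaRun(name), steps, feed) for name in names)
    )
    live = [feed.get_nowait() for _ in range(feed.qsize())]
    return runs, live


runs, live = asyncio.run(run_all(["Martha", "Raj"]))
```

In this sketch no persona can read another's history, yet a single consumer sees every event from every persona as the runs interleave.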

What's next for Parallax

Custom persona generation: Let users define their own personas to match their use case.
CI/CD Integration: A "Parallax Check" that runs automatically on every Pull Request.
Voice Integration: Use Gemini Live to let users "talk" to the agents as they navigate, for example to ask, "Martha, why did you stop there?"

Built With

docker
fastapi
gemini
google-artifact-registry
google-cloud
google-cloud-firestore
google-cloud-run
google-cloud-sdk
google-gen-ai-sdk
googleadk
javascript
playwright
python
react
uvicorn
vite
websockets

Updates

Vani Chitkara started this project — Mar 16, 2026 12:10 PM EDT
