KK Agent

Inspiration

We wanted to build the fastest browser agent possible without sacrificing accuracy. Existing agents were too slow, so we focused on cutting latency and on atomic tools that collapse several browser steps into a single call.

What it does

KK Agent is a high-speed multimodal browser agent. It navigates real-world web apps (Calendar, Email, Social) by visually processing the page and executing precise actions. It is specifically tuned to minimize the time between "seeing" and "acting."

How we built it

We cut latency at every layer we could reach:

Atomic Tools: Built replace_text and type_dropdown to combine multiple steps (click, clear, type, enter) into single actions, drastically reducing model round-trips.
Image Pipeline: Implemented local downscaling of screenshots before sending them to the API, cutting processing time significantly.
Latency Tuning: Reduced browser timeouts to 500ms and minimized typing delays for near-instant execution.
Model Config: Configured thinking_level and media_resolution to "low" on Gemini 3 Pro to prioritize raw speed.
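
To illustrate the Atomic Tools idea above, here is a minimal sketch of what a replace_text tool could look like. The PageLike interface, its method names, and the RecordingPage test double are hypothetical stand-ins for whatever browser automation library the agent actually uses; only the tool name and the click, clear, type, enter sequence come from the writeup.

```python
from typing import Protocol


class PageLike(Protocol):
    """Hypothetical minimal browser-page interface (not a real library API)."""

    def click(self, selector: str) -> None: ...
    def fill(self, selector: str, text: str) -> None: ...
    def press(self, key: str) -> None: ...


def replace_text(page: PageLike, selector: str, text: str, submit: bool = True) -> str:
    """Focus a field, overwrite its contents, and optionally press Enter.

    The whole sequence runs inside one tool call, so the model is
    consulted once instead of once per step.
    """
    page.click(selector)       # focus the field
    page.fill(selector, "")    # clear whatever was there
    page.fill(selector, text)  # enter the new value
    if submit:
        page.press("Enter")    # confirm the input
    return f"replaced text in {selector!r}"


class RecordingPage:
    """Test double that records calls instead of driving a browser."""

    def __init__(self) -> None:
        self.calls = []

    def click(self, selector: str) -> None:
        self.calls.append(("click", selector))

    def fill(self, selector: str, text: str) -> None:
        self.calls.append(("fill", selector, text))

    def press(self, key: str) -> None:
        self.calls.append(("press", key))
```

Exposed to the model as one tool, a sequence like this costs a single round-trip where four separate tools would cost four, each possibly with its own screenshot.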

Challenges we ran into

The "Naughty" Timeout: Dropping browser page timeouts to 500ms was risky. It made the agent incredibly fast but prone to race conditions, which we had to handle gracefully.
Tab Detection: The model struggled to distinguish between navigation tabs and buttons, requiring specific prompt tweaks to fix.
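
One common way to live with a very short timeout is to retry the action a few times before giving up. The sketch below shows that pattern only; it is not necessarily how KK Agent handles its race conditions. The function name, the retry count, and the pause length are assumptions, and the built-in TimeoutError stands in for whatever exception the browser library raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(action: Callable[[], T], attempts: int = 3, pause_s: float = 0.05) -> T:
    """Run `action`, retrying when it raises TimeoutError.

    Each attempt is expected to use its own short timeout (for example
    500 ms), so a page that is briefly not ready costs one cheap retry
    rather than a failed task.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of retries: surface the timeout to the caller
            time.sleep(pause_s)  # brief pause so the page can settle
    raise AssertionError("unreachable")
```

The trade-off is that the worst case grows to roughly attempts times the per-attempt timeout, while the common case stays as fast as a single short wait.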

Accomplishments that we're proud of

35s Benchmark: We successfully brought the Calendar task time down to ~35 seconds through our local downscaling and delay reduction strategies.
Reliable Inputs: The replace_text tool completely solved issues with clearing fields, making form filling bulletproof.

What we learned

Downscaling Wins: Shrinking screenshots locally before upload is faster than sending full-res images to the cloud.
Atomic > Chatty: Combining actions into single tools was our biggest performance booster.
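
To make the downscaling point concrete, here is a small sketch of the size calculation that would precede a resize. The function name and the 1024-pixel cap are assumptions for illustration; the writeup does not state the target resolution or the image library used, and the actual pixel resampling would be done by that library.

```python
def downscaled_size(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return dimensions that fit within `max_side`, preserving aspect ratio.

    Images that already fit are returned unchanged, so screenshots are
    never upscaled.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height, and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Scale both sides by the same factor; keep each side at least 1 pixel.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


print(downscaled_size(1920, 1080))  # → (1024, 576)
```

Under these assumed numbers, a 1920x1080 screenshot shrinks to 1024x576, which is about 3.5 times fewer pixels to encode, upload, and process.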

What's next for KK Agent

Refining the "Nano" model experiments for even faster local execution.
Re-enabling parallel task execution once the single-thread performance is fully maximized.

Built With

python

Updates

Ka'i Kau posted an update — Nov 23, 2025 04:24 PM EST

I will note that one of my optimizations, the type_dropdown tool I created to reduce the number of tool calls and screenshots required, is failing on the Calendar time dropdown, though it would likely succeed on Google Calendar.


Ka'i Kau posted an update — Nov 23, 2025 04:22 PM EST

Best run so far: 37.5% completion of the provided sample tasks, with a total benchmark time of 986.54s.


Ka'i Kau posted an update — Nov 23, 2025 03:17 PM EST

Note: I have it set to not run tasks in parallel, and I made a little GUI to select which tasks to run because I am a visual person.


Ka'i Kau started this project — Nov 23, 2025 03:11 PM EST

Leave feedback in the comments!
