Inspiration

Let's be real: Siri doesn't get it. I hate that my "assistant" is blind. It can't see my screen, and it definitely can't click buttons for me. I wanted to fix that. I built Lavis because I wanted an AI that doesn't just chat, but actually drives the computer—seeing what I see, clicking what I click.

What it does

Lavis is a digital human living on your Mac. It breaks out of the chatbox. Powered by Gemini 2.0, it watches your screen in real time and uses your mouse and keyboard to get things done. No APIs, no special integrations. You tell it: "Send a WhatsApp message to Mom" or "Play some jazz on Spotify," and it just moves the mouse and does it. It interacts with pixels, not code.
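That pixel-level loop is natively available on the JVM through `java.awt.Robot`. Here's a minimal sketch of the idea; the `PixelDriver` class, the `parseClick` helper, and the `"CLICK x y"` action format are illustrative assumptions, not Lavis's actual protocol:

```java
import java.awt.GraphicsEnvironment;
import java.awt.Robot;
import java.awt.event.InputEvent;

public class PixelDriver {
    // Hypothetical action format a vision model might return, e.g. "CLICK 200 300".
    static int[] parseClick(String action) {
        String[] parts = action.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("CLICK")) {
            throw new IllegalArgumentException("unsupported action: " + action);
        }
        return new int[] { Integer.parseInt(parts[1]), Integer.parseInt(parts[2]) };
    }

    public static void main(String[] args) throws Exception {
        int[] target = parseClick("CLICK 200 300");
        System.out.println("target=" + target[0] + "," + target[1]);

        // Driving the real cursor needs a display; skip when running headless.
        if (!GraphicsEnvironment.isHeadless()) {
            Robot robot = new Robot();
            robot.mouseMove(target[0], target[1]);
            robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
            robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
        }
    }
}
```

Because `Robot` injects OS-level input events, the same click works in any app, no per-app integration required.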

How we built it

We built this beast entirely in Java 21 and Spring Boot. Native screen capture grabs pixels instantly, and Gemini 2.0 Flash analyzes the UI and plans the steps.
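The capture step can be sketched with `Robot.createScreenCapture` plus PNG encoding before the frame is shipped to the vision model. `ScreenGrabber` and `toPng` are hypothetical names, and the headless fallback below exists only so the pipeline runs without a display:

```java
import java.awt.GraphicsEnvironment;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import javax.imageio.ImageIO;

public class ScreenGrabber {
    // Encode a captured frame as PNG bytes, ready to send to a vision model.
    static byte[] toPng(BufferedImage frame) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(frame, "png", out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        BufferedImage frame;
        if (GraphicsEnvironment.isHeadless()) {
            // No display (e.g. CI): use a stand-in frame so the code still runs.
            frame = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        } else {
            Robot robot = new Robot();
            frame = robot.createScreenCapture(
                new Rectangle(Toolkit.getDefaultToolkit().getScreenSize()));
        }
        System.out.println("frame bytes: " + toPng(frame).length);
    }
}
```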

Accomplishments that we're proud of

It feels alive. Watching the mouse move in a smooth, human curve instead of a robotic jump is satisfying. We proved you don't need complex APIs to control a computer; you just need a smart vision model and a good pair of virtual hands.
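That human-feeling glide boils down to sampling an ease-in-out curve between the current and target positions instead of jumping in one step. A minimal sketch, with `SmoothMove`, `ease`, and `path` as illustrative names rather than the project's actual code:

```java
import java.awt.Point;
import java.util.ArrayList;
import java.util.List;

public class SmoothMove {
    // Cubic ease-in-out: slow start, fast middle, slow stop, like a human hand.
    static double ease(double t) {
        return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
    }

    // Sample `steps` intermediate points along the eased path from a to b.
    static List<Point> path(Point a, Point b, int steps) {
        List<Point> pts = new ArrayList<>();
        for (int i = 0; i <= steps; i++) {
            double t = ease((double) i / steps);
            pts.add(new Point(
                (int) Math.round(a.x + (b.x - a.x) * t),
                (int) Math.round(a.y + (b.y - a.y) * t)));
        }
        return pts;
    }

    public static void main(String[] args) {
        // Feed each point to Robot.mouseMove with a tiny sleep for the glide.
        for (Point p : path(new Point(0, 0), new Point(100, 50), 10)) {
            System.out.println(p.x + "," + p.y);
        }
    }
}
```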

What's next for Lavis

Speed and memory. We're making reaction times faster. Soon, you'll be able to talk to it in real time while it works, just like a pair programmer.

Built With

java, spring-boot, gemini
