## Inspiration

Every AI chat tool forgets you the moment you close the window. You explain your project, your files, your context, then do it all over again next session. We wanted an AI that actually remembers: not a chatbot, but a coworker that knows your project and can do real work on your computer, not just talk about it.

We also wanted to prove that Gemini 3 can power a full autonomous agent, not just answer questions. Give it tools, give it memory, and let it figure out multi-step tasks on its own.

## What it does

Neo is a desktop AI coworker powered by Gemini 3. You point it at any folder on your computer and it uses Gemini to index everything, generating semantic summaries stored in .neomemory/. In the next session, Neo loads those summaries, so Gemini already knows your project.

It's not just for developers. Writers use it to proofread and organize drafts. Researchers use it to search the web and compile findings. Students use it to break down assignments. Anyone with files on their computer can use Neo.

Gemini 3 Flash handles everyday tasks fast. Switch to Gemini 3 Pro for deeper reasoning on complex problems.

## How we built it

We built Neo with Tauri v2 (Rust + native WebView) for the desktop shell, React 19 + TypeScript for the UI, and the Google GenAI SDK talking directly to Gemini 3.

The core is a Gemini-powered agent loop. You send a message, Gemini 3 chooses which of the 30+ available tools to call (file I/O, shell, web search, memory, todos), Neo executes them and feeds the results back to Gemini, and the loop repeats until the task is done. Gemini's large context window is critical here: we inject the full memory context at every turn, so the model never loses project awareness, even across long multi-step tasks.
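The loop described above can be sketched roughly as follows. This is a minimal illustration, not Neo's actual code: the `Model`, `Tool`, `ToolCall`, and `Message` types, the `runAgentLoop` name, and the `maxTurns` limit are all assumptions, and the model is passed in as a plain function so that no particular SDK call is implied.

```typescript
// Hypothetical shapes; the real SDK types and Neo's internals are not shown in this writeup.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text?: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "model" | "tool"; content: string };
type Tool = (args: Record<string, unknown>) => Promise<string>;
type Model = (memory: string, history: Message[]) => Promise<ModelTurn>;

async function runAgentLoop(
  model: Model,
  tools: Record<string, Tool>,
  memory: string,
  userMessage: string,
  maxTurns = 20, // assumed safety limit so the loop always terminates
): Promise<string> {
  const history: Message[] = [{ role: "user", content: userMessage }];
  for (let turn = 0; turn < maxTurns; turn++) {
    // The memory context is passed on every turn, as described above.
    const reply = await model(memory, history);
    // No tool calls means the model considers the task finished.
    if (reply.toolCalls.length === 0) return reply.text ?? "";
    history.push({ role: "model", content: JSON.stringify(reply.toolCalls) });
    for (const call of reply.toolCalls) {
      const tool: Tool | undefined = tools[call.name];
      let result: string;
      try {
        result = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
      } catch (err) {
        // Errors are fed back as results so the model can react to them.
        result = `error: ${String(err)}`;
      }
      history.push({ role: "tool", content: `${call.name}: ${result}` });
    }
  }
  return "stopped: turn limit reached";
}
```

Feeding tool errors back into the history, rather than throwing, is one way to let the model retry or change approach on its own.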

We rely heavily on Gemini 3's native tool calling. The model receives typed tool schemas via function declarations and returns structured tool calls that Neo executes directly. No prompt hacking, no regex parsing. Gemini's structured output makes the agent loop reliable.
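To make "typed tool schemas" concrete, here is a small sketch of what a declaration and a sanity check on a returned call might look like. The types are local stand-ins rather than the SDK's own, and the `read_file` tool, its description, and `validateCall` are hypothetical names used only for illustration.

```typescript
// Local stand-in types; the SDK's own declaration types are not reproduced here.
type ParamSchema = { type: "string" | "number" | "boolean"; description?: string };
type ToolDeclaration = {
  name: string;
  description: string;
  parameters: { type: "object"; properties: Record<string, ParamSchema>; required: string[] };
};

const readFileDeclaration: ToolDeclaration = {
  name: "read_file", // hypothetical tool name
  description: "Read a UTF-8 text file inside the indexed folder.",
  parameters: {
    type: "object",
    properties: {
      file_path: { type: "string", description: "Path relative to the project root." },
    },
    required: ["file_path"],
  },
};

// Returns a list of problems; an empty list means the call matches the declaration.
function validateCall(decl: ToolDeclaration, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of decl.parameters.required) {
    if (!(key in args)) problems.push(`missing required parameter: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const schema: ParamSchema | undefined = decl.parameters.properties[key];
    if (!schema) problems.push(`unexpected parameter: ${key}`);
    else if (typeof value !== schema.type) problems.push(`parameter ${key} should be ${schema.type}`);
  }
  return problems;
}
```

Because the model returns a structured call (a tool name plus an arguments object), a check like this operates on plain data instead of parsing free text.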

The memory system itself is Gemini-powered: every file summary in .neomemory/ is generated by Gemini 3, which means the model is reading and understanding your code/docs semantically, not just listing file names.

## Challenges we ran into

Tauri v2's shell security model was the biggest early blocker. Command.create() requires a named shell scope in a capabilities config, and the first argument is a scope identifier, not the program path. Every command failed with "program not allowed" until we figured out the config.

We also discovered that Tauri's Command.execute() returns stdout/stderr on the result object, but we were reading output through event listeners, which only work with spawn(). As a result, every shell command returned empty output. It was a subtle API difference that cost us hours of debugging.
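The fix amounts to reading output from the resolved result instead of from listeners. The sketch below uses a local `ExecutableCommand` interface as a stand-in for the shell plugin's command object so that it runs on its own; `runToCompletion` is a hypothetical helper, and treating any non-zero exit code as a failure is an assumption.

```typescript
// Minimal local stand-in for the result shape described above.
type ShellOutput = { code: number | null; stdout: string; stderr: string };
type ExecutableCommand = { execute(): Promise<ShellOutput> };

// Runs a command to completion and returns its stdout.
// Output is read from the resolved result, not from event listeners.
async function runToCompletion(command: ExecutableCommand): Promise<string> {
  const output = await command.execute();
  if (output.code !== 0) {
    throw new Error(`command failed (${output.code}): ${output.stderr.trim()}`);
  }
  return output.stdout;
}
```

In a Tauri app, the argument would be something like `Command.create("run-git", ["status"])` from `@tauri-apps/plugin-shell`, where `run-git` is a hypothetical scope name defined in the capabilities config rather than a program path.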

On the Gemini side, we had to handle parameter naming mismatches. Gemini 3 sometimes sends path when the tool expects file_path. We built a parameter alias system that normalizes these automatically so the agent doesn't break.
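A parameter alias layer of this kind can be quite small. The following is a sketch under assumptions, not Neo's implementation: the alias table contents, the `read_file` tool name, and `normalizeArgs` are illustrative.

```typescript
// Hypothetical alias table: per tool, maps names the model sometimes sends
// to the names the tool actually expects.
const PARAM_ALIASES = new Map<string, Map<string, string>>([
  ["read_file", new Map([["path", "file_path"], ["filename", "file_path"]])],
]);

function normalizeArgs(
  toolName: string,
  args: Record<string, unknown>,
): Record<string, unknown> {
  const aliases = PARAM_ALIASES.get(toolName);
  const normalized: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    const canonical = aliases?.get(key) ?? key;
    // If the canonical name was also sent explicitly, it wins over the alias.
    if (canonical !== key && canonical in args) continue;
    normalized[canonical] = value;
  }
  return normalized;
}

// The path/file_path mismatch from the text:
console.log(normalizeArgs("read_file", { path: "notes.md" }));
```

Running the normalizer before the tool executes keeps the tool implementations strict about their own parameter names.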

## Accomplishments that we're proud of

The memory system genuinely works. You can close Neo, come back a week later, and Gemini already knows your project structure, naming conventions, and past decisions. That feeling of not re-explaining your codebase is something we haven't found in any other desktop AI tool.

We also got Gemini 3 running as a real autonomous agent, not just a Q&A bot. It chains tools together, handles errors, retries on failures, and self-corrects. The loop detection catches when Gemini gets stuck repeating itself and nudges it to try a different approach.
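One simple way to detect that kind of repetition is to watch for identical tool calls within a short window. The sketch below is an assumption about how such a check could work, not a description of Neo's detector; `createLoopDetector` and its `windowSize` and `maxRepeats` defaults are hypothetical.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };

// Returns a function that records each call and reports true once the same
// call has appeared maxRepeats times within the last windowSize calls.
function createLoopDetector(windowSize = 6, maxRepeats = 3): (call: ToolCall) => boolean {
  const recent: string[] = [];
  return (call: ToolCall): boolean => {
    // Calls are compared by name plus serialized arguments; the same
    // arguments in a different key order count as a different call.
    const signature = `${call.name}:${JSON.stringify(call.args)}`;
    recent.push(signature);
    if (recent.length > windowSize) recent.shift();
    const repeats = recent.filter((s) => s === signature).length;
    return repeats >= maxRepeats;
  };
}
```

When the detector reports a loop, the agent can append a short message to the conversation asking the model to try a different approach, which matches the nudge described above.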

## What we learned

Gemini 3's tool calling is genuinely good enough to power an autonomous agent loop. The structured function call output means you don't need fragile prompt engineering to get reliable tool use. The large context window makes it practical to inject full project memory at every turn without running out of space.

We also learned that persistent memory changes how people interact with AI. When the tool remembers context, you stop writing long prompts and start talking to it like a coworker.

## What's next for Neo

Gemini's multimodal capabilities open up a lot: drag in screenshots for UI feedback, process PDFs and images, use the camera for whiteboard-to-code workflows. We want to add voice input using Gemini's audio understanding so non-technical users can just talk to Neo. We also want collaborative memory, so teams can share a .neomemory/ and Gemini understands the whole team's context.
