Inspiration

A lot of LLM-powered smart assistants are prone to hallucination. I want to build a simple self-hosted smart assistant that is significantly LESS prone to hallucination and can get my stuff done in ONE SHOT, from a single prompt. (Sorry, I've been prompting too much and I just type in all CAPS now.)

Why pit LLMs against each other?

I imagine a world where everyone has their own AI system that works in their favor. Third-party agents hosted remotely compete for the user's interest and engagement in front of a user-defined AI agent, and the user has the final say through their preferences. For example, my selection agent prioritizes price when choosing between Uber and Lyft.

What it does

Given the short time frame, we built only 5 agents to showcase its capabilities: memory, research (web search), calendar, Uber and Lyft. For now it can answer general questions about you from information stored in the Memory agent, like "what's my wife's birthday" or "what's my passport number". The difficult part is handling multi-agent queries in one shot, like "take me to my next meeting" or "take me home". Each agent executes within a sandbox, so I think this architecture allows more agents to be added without increasing complexity for the LLM. Its capabilities today admittedly aren't the most impressive, but I believe the real magic lies in how little this system hallucinates and in its ability to say no to the user.

How we built it

Bidding System

When a query enters the system, we poll each agent with a simple question, "can you do it?", and eliminate the ones that answer no. Then we start executing the task on the remaining agents in parallel and wait for each agent's first user-facing chat completion. Once all the results are in, we call a selection LLM (7B) to pick the winning agent, with preference given to 1. relevance and 2. price competitiveness.
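The two rounds described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `StubAgent`, `Bid`, `run_auction` and `select_winner` are hypothetical names, the agents are hard-coded stubs instead of LLM calls, and a plain sort key stands in for the selection LLM.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bid:
    agent: str
    reply: str              # first user-facing completion from the agent
    relevance: float        # 0.0 to 1.0, hypothetical score
    price: Optional[float]  # quoted price, None when the agent has no price


class StubAgent:
    """Stand-in for an LLM-backed agent; all values are hard-coded."""

    def __init__(self, name, can_do, relevance, price=None):
        self.name, self._can_do = name, can_do
        self._relevance, self._price = relevance, price

    async def can_handle(self, query: str) -> bool:
        # Real system: ask the agent's LLM "can you do it?" and parse yes/no.
        return self._can_do

    async def execute(self, query: str) -> Bid:
        # Real system: run the task until the first user-facing completion.
        reply = f"{self.name} can handle: {query}"
        return Bid(self.name, reply, self._relevance, self._price)


def select_winner(bids):
    # Stand-in for the selection LLM: relevance first, then the lower price.
    def key(bid):
        price = bid.price if bid.price is not None else float("inf")
        return (-bid.relevance, price)
    return min(bids, key=key) if bids else None


async def run_auction(agents, query):
    # Round 1: poll every agent in parallel and drop the ones that say no.
    answers = await asyncio.gather(*(a.can_handle(query) for a in agents))
    bidders = [a for a, ok in zip(agents, answers) if ok]
    # Round 2: run the remaining agents in parallel and wait for all bids.
    bids = await asyncio.gather(*(a.execute(query) for a in bidders))
    return select_winner(list(bids))


agents = [
    StubAgent("calendar", can_do=False, relevance=0.0),
    StubAgent("uber", can_do=True, relevance=0.9, price=18.50),
    StubAgent("lyft", can_do=True, relevance=0.9, price=16.75),
]
winner = asyncio.run(run_auction(agents, "take me home"))
print(winner.agent)
```

With the made-up prices above, both ride agents tie on relevance, so the cheaper one wins. When no agent says yes, `run_auction` returns `None`, which is where the assistant would tell the user no.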

Two classes of Agents

Of the five agents, Calendar, Research and Memory are considered "first-party" agents, as these agents hold sensitive and privileged user information. Uber and Lyft are considered "third-party" agents: they don't store sensitive personal information, and they can request access permission to "Memory" or "Calendar" to unlock more use cases.
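One way to picture the split between the two classes is a small permission registry. This is only a sketch under assumptions: the `PermissionRegistry` class, its method names, and the rule that first-party agents can always reach each other are hypothetical and not taken from the project.

```python
from dataclasses import dataclass, field

# Agent names follow the writeup; the set itself is illustrative.
FIRST_PARTY = {"memory", "calendar", "research"}


@dataclass
class PermissionRegistry:
    # Maps a third-party agent name to the first-party agents it may query.
    grants: dict = field(default_factory=dict)

    def grant(self, requester: str, target: str) -> None:
        if target not in FIRST_PARTY:
            raise ValueError(f"{target} is not a first-party agent")
        self.grants.setdefault(requester, set()).add(target)

    def can_access(self, requester: str, target: str) -> bool:
        # Assumption: first-party agents are trusted by default, while
        # third-party agents need an explicit grant from the user.
        if requester in FIRST_PARTY:
            return True
        return target in self.grants.get(requester, set())


registry = PermissionRegistry()
registry.grant("uber", "memory")  # the user approves Uber reading Memory
print(registry.can_access("uber", "memory"))
print(registry.can_access("lyft", "calendar"))
```

Here Uber has been granted access to Memory, while Lyft has not been granted anything yet, so its request to Calendar is refused.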

Agent Graph

With the Uber and Lyft agents connected to Memory and Calendar, queries that require multi-agent hops, like "get me a car to my next meeting" or "take me home", simply work in one shot.
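A toy version of that hop might look like the sketch below. Everything in it is assumed for illustration: the addresses and the calendar entry are made up, the function names are hypothetical, and simple keyword matching stands in for the LLM that would decide which connected agent to call.

```python
# Hypothetical data; in the real system these would be LLM-backed agents.
MEMORY = {"home": "12 Example Street"}
CALENDAR = [{"title": "Design review", "location": "500 Sample Avenue"}]

# Edges of the agent graph: which first-party agents each agent may call.
GRAPH = {"uber": {"memory", "calendar"}, "lyft": {"memory", "calendar"}}


def ask_memory(key):
    return MEMORY.get(key)


def ask_calendar_next_location():
    return CALENDAR[0]["location"] if CALENDAR else None


def resolve_destination(agent, query):
    """Hop to a connected first-party agent to turn a query into an address."""
    neighbors = GRAPH.get(agent, set())
    text = query.lower()
    if "next meeting" in text and "calendar" in neighbors:
        return ask_calendar_next_location()
    if "home" in text and "memory" in neighbors:
        return ask_memory("home")
    return None  # no known destination: refuse instead of guessing


def ride_agent(agent, query):
    destination = resolve_destination(agent, query)
    if destination is None:
        return f"{agent}: cannot do it"
    return f"{agent}: booking a ride to {destination}"


print(ride_agent("uber", "take me home"))
print(ride_agent("lyft", "get me a car to my next meeting"))
print(ride_agent("lyft", "take me to the moon"))
```

The last query has no edge that can resolve it, so the agent declines rather than inventing a destination.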

Challenges we ran into

Getting an LLM to answer with a plain Yes or No was not straightforward.
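The writeup doesn't say how this was solved. One common approach, sketched here purely as an assumption, is to parse the reply strictly and treat anything that isn't a clear "yes" as a "no". The `parse_yes_no` helper is hypothetical.

```python
import re
from typing import Optional


def parse_yes_no(raw: str) -> Optional[bool]:
    """Return True/False for a clear yes/no, or None when the reply is unclear."""
    # Accept leading punctuation or markdown, then require a whole word.
    match = re.match(r"\W*(yes|no)\b", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"


def may_bid(raw: str) -> bool:
    # Only an explicit "yes" lets an agent stay in the auction.
    return parse_yes_no(raw) is True


print(may_bid("Yes, I can book that ride."))
print(may_bid("Nothing in my tools covers that."))
```

Rambling or hedged replies fall through to `None`, so an unclear agent is eliminated and never guessed into the running.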

Accomplishments that we're proud of

The architecture is easy to explain, yet it works well and fast for my use case so far. With this structure, I can add third-party agents easily, and I can see how a developer platform could form around integrating an API with an agent.

What we learned

Keeping the context passed to the LLM to a minimum is important.

What's next for Smart assistant with graph-connected and bid-placing agents

Productionizing it on my home Mac environment so I can host my own smart assistant. Making it smarter by adding more first-party agents like search, contacts and reminders. Adding more third-party agents, such as an integration with my smart home system.

Loom link https://www.loom.com/share/d75d4eebf6da4079b81b88cda3e6d3bd
