Inspiration
In our fast-paced lives, we’re often mentally overloaded and physically disconnected from the things that keep our world spinning. We’re stuck on the go, wanting to plan and book vacations, finalize reservations and daily plans, or finish time-consuming tasks without stopping everything else we’re doing.
Impressive as they are, current digital assistants are still static and limited: they won't take the real, autonomous action that would meaningfully lighten your daily workload. I was inspired to bridge this gap. What if your software didn't just remind you of your life, but actually lived it with you? I envisioned a world where you could text your personal agent and describe a complex task as easily as you would talk to a friend.
What it does
Omni AI completes these complex, multi-step digital tasks for you without any intervention from you.
- Talk to Omni from Anywhere in the World - Send a simple iMessage text or voice message from your phone to your computer, wherever you are, and Omni will complete your task and send you a summary once it's done. You can give natural-language commands like, "I'm heading to SF; find me a Victorian tour and a healthy lunch spot, and book both." Omni does the deep work for you; the more specific your commands, the better.
- Autonomous Task Execution - Unlike traditional assistants, Omni actually does the work. It draws on specific context from past conversations and from records such as your transaction history (with your permission, via the Plaid API) to complete complex, multi-step tasks. It can handle real-world bookings, context-specific research, and resource-heavy, time-consuming, boring work.
- Deep Context - Omni understands your life better than a standard assistant. By analyzing data such as your bank transaction history (via the Plaid API) and maintaining persistent context, it knows your budget, your favorite coffee, your must-have routines, and much more, so you can feel confident letting Omni handle tasks without your intervention.
- Real-Time Transparency - You’re never left wondering what Omni is doing. It displays its thinking steps in a live UI so you can watch its progress. Because it runs in detached cloud sessions, it keeps working in the background even if you close your phone or computer, then sends you a final confirmation with a summary once the job is done.
How I built it
Omni uses a multi-agent architecture to bridge digital planning and real-world execution, and to handle the various types of tasks someone may want completed. I used OpenManus as the central engine. It processes high-level intent from the iMessage or direct prompt and determines the precise sequence of web actions and research needed to fulfill a request.
To solve problems that require deep logic, I integrated OpenAI o3-deep-research as a fallback option. Omni maintains a rolling memory of user preferences and habits, so that it knows enough about you to confidently make decisions on your behalf. I integrated the Plaid API to pipe the user's latest transaction data (with permission) into the overall GPT-5 analysis. Omni uses this information in several ways, such as budgeting and understanding your buying trends and hobbies.
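The context-building step described above might look roughly like the sketch below. It is only an illustration, not Omni's actual code: `UserContext`, `summarize_spending`, and `build_prompt_context` are hypothetical names, and the transaction dictionaries use simplified stand-in fields (`merchant`, `category`, `amount`) rather than Plaid's real response schema.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class UserContext:
    """Hypothetical rolling memory: only the most recent notes are kept."""
    max_notes: int = 50
    notes: deque = field(init=False)

    def __post_init__(self):
        # A bounded deque silently drops the oldest note when it is full.
        self.notes = deque(maxlen=self.max_notes)

    def remember(self, note):
        self.notes.append(note)


def summarize_spending(transactions):
    """Total spend per category, plus the most frequently seen merchant."""
    totals = Counter()
    merchants = Counter()
    for tx in transactions:
        totals[tx["category"]] += tx["amount"]
        merchants[tx["merchant"]] += 1
    top_merchant = merchants.most_common(1)[0][0] if merchants else None
    return {"spend_by_category": dict(totals), "top_merchant": top_merchant}


def build_prompt_context(memory, transactions):
    """Render memory and spending into plain text a model prompt could include."""
    summary = summarize_spending(transactions)
    lines = [f"Top merchant: {summary['top_merchant']}"]
    for category, total in sorted(summary["spend_by_category"].items()):
        lines.append(f"Spend on {category}: ${total:.2f}")
    lines.extend(f"Note: {note}" for note in memory.notes)
    return "\n".join(lines)
```

Bounding the memory keeps the prompt from growing without limit, at the cost of forgetting older preferences; a real system would likely need something smarter than dropping the oldest note.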
To make sure Omni never stops working, the backend runs in detached tmux worker sessions on a GCP cloud server. This allows for asynchronous task execution: you can send a command and close your phone while Omni handles hours of background research.
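As a rough sketch of how a worker could be started in a detached tmux session, assuming tmux is installed on the server: the helper names, the session name, and the `worker.py` script in the usage comment are hypothetical, while `tmux new-session -d -s <name> <command>` is the standard way to create a session without attaching to it.

```python
import shlex
import subprocess


def build_tmux_command(session_name, worker_argv):
    """Build the argv that launches a worker inside a detached tmux session."""
    # tmux passes the final argument to a shell, so quote each piece safely.
    shell_command = shlex.join(worker_argv)
    return ["tmux", "new-session", "-d", "-s", session_name, shell_command]


def launch_worker(session_name, worker_argv):
    """Start the worker; returns as soon as tmux has created the session."""
    subprocess.run(build_tmux_command(session_name, worker_argv), check=True)


# Example (hypothetical script and flag):
# launch_worker("omni-task-42", ["python", "worker.py", "--task-id", "42"])
```

Because the session is detached, the worker keeps running after the process that launched it exits, which is what lets the sender disconnect.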
Challenges I ran into
A major challenge was making sure tasks were done correctly. It’s one thing if a chatbot gives you a wrong fact; it’s entirely different if an agent books a non-refundable flight to the wrong city. Building the verification layer, where the agent pauses to double-check its final cart against the user's original constraints, was the hardest and most important logic to build. I stress-tested this by having the agent complete free bookings (like walking tours in SF) without connected payment info, to make sure it could navigate checkout flows correctly.
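A minimal sketch of such a pre-checkout check, under the assumption that the request has already been parsed into structured constraints. The `Constraints` and `CartItem` types and the three rules shown (city, budget, requested items) are hypothetical examples, not Omni's actual verification logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    city: str
    max_total: float
    required_items: frozenset


@dataclass(frozen=True)
class CartItem:
    name: str
    city: str
    price: float


def verify_cart(cart, constraints):
    """Return a list of problems; an empty list means the cart can be submitted."""
    problems = []
    for item in cart:
        if item.city.lower() != constraints.city.lower():
            problems.append(f"{item.name} is in {item.city}, not {constraints.city}")
    total = sum(item.price for item in cart)
    if total > constraints.max_total:
        problems.append(f"total {total:.2f} exceeds budget {constraints.max_total:.2f}")
    missing = constraints.required_items - {item.name for item in cart}
    for name in sorted(missing):
        problems.append(f"missing requested item: {name}")
    return problems
```

Returning every problem, rather than stopping at the first, gives the agent enough information to repair the cart in one pass before trying again.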
Without access to accurate financial data or persistent user history, the models perform noticeably worse and can make serious mistakes. An important part of the work was building out the context that allows Omni to perform its tasks accurately.
Accomplishments that I am proud of
Complete Autonomy: The biggest accomplishment is Omni's core function: completing complex, multi-step tasks accurately and relatively quickly while completely unsupervised.
Asynchronous Architecture: This isn't just a script that dies when you close your laptop. You can send off a command, and once it's received you can close both your computer and your phone while Omni keeps working until the job is done.
What I learned
The Complexity of Real-World Reliability - I learned that multi-step task completion is messy. Building something that doesn't hallucinate information while booking, and that still reaches the right outcome with no human to guide it, was challenging and tested the limits of my understanding.
I discovered that raw model power (like GPT-5) is only half the battle. Without the right data, such as financial context or persistent context about the user, even the smartest model in the world is far less useful. Good, relevant data is key.
What's next for Omni AI
I want to make Omni proactive and anticipatory, so that it learns to understand what you want before you even have to ask. All you would need to do is approve the task, and Omni would complete it just like it does now. That way all of us can spend less time worrying about what we are not getting done, and focus on the things that matter more to us.