Note: the website is fully deployed, functioning, and scalable at withonyx.com. The demo video might not be super helpful since you interact with Onyx through text/WhatsApp (i.e. on your phone), so feel free to register and play with it yourself! It's free (for now), and from my perspective, the more users the better!
Inspiration
Lauren Reeder and Michelle Fradin’s piece “The New Language Model Stack” is what inspired this idea.
In the piece, they emphasized the growing demand for vector databases (retrieval methods), foundational models (essential for any LLM application, of course), and monitoring/guardrails solutions in the LLM space. And yet, for all the talk of customizing these models to a company’s specific knowledge base, values, and mission, there was zero reference to molding these products to better fit the customer, the end user. It’s almost as if we’re supposed to believe that all humans, the most varied group of living beings we know of, speak and interact with text (which carries more latent meaning than anything I know of) in the same way.
This didn’t make sense to me until I realized that most applications of LLMs have thus far been oriented towards professional sectors. And nobody cares how their logging solution tells them their Kubernetes cluster is running inefficiently, as long as it does so accurately. Similarly, nobody cares about the tone with which an LLM summarizes their HR handbook, or performs any number of similarly emotionless tasks.
But that is missing a huge opportunity in my eyes. The real magic of this new technology is that it can interact with us in the most human-like of ways. We will have technology that lets you find the perfectly fitting house, or plan the most beautiful wedding, to give just a couple of examples, through a natural language interface that’s inviting and delightful.
In my eyes, there are only two things missing before this grand vision can become a reality.
- Accuracy
- Personality
I’ll leave accuracy aside, as it's been beaten to death and there are hundreds of companies and research labs tackling it from all directions.
Yet nobody is working on figuring out how to make the personality of these agents fit the end user. This is a huge gap in the market, for a few reasons. We will need to feel at ease with these interfaces, delighted even, if we’re going to trust them with our most personal affairs: think therapy, ordering products, buying houses. We cannot constantly onboard ourselves to each new natural language interface a product offers. Imagine having a personal assistant that you had to reintroduce yourself to each time you asked them to perform a task for you. There will be a multi-billion dollar startup that creates the plug-and-play solution for companies looking to address their customers’ needs with a natural language interface.
But how do you build such a company? I’m basically talking about getting users to willingly give up a lot of data on themselves so that large companies can better tailor experiences to them. This sounds like some sort of predatory advertising business. In that form, I don’t think a company would succeed. Instead, the company must delight its users while providing immense utility to them, and position its interfaces with existing products as a way to make their lives easier (which it really does).
That is what I’m building. The steps to that vision are as follows.
- Provide utility to users at tremendous value (in the form of a friend and text-based assistant) while letting them grow comfortable with Onyx. [still at this step currently]
- Offer too-good-to-be-true custom integrations. Imagine the task you dread most in the entire world: maybe it’s paying taxes, maybe it’s planning trips or orchestrating around your friends’ busy schedules, or maybe it’s planning a last-minute wedding. We will solve that problem for our customer, and it will be extremely easy (compared to how other companies would find the task) because of all the data and trust we will have gathered.
- Integrate with every major consumer-focused web-based product in the world. Amazon will have Onyx’s logo and interface, or at least pull data from it. Airbnb will as well. Onyx will be synonymous with natural language interfaces, because it will be what makes them work (for the average consumer’s use case). And it will pull ahead of other “universal personal assistants” such as adept.ai because it will integrate at the application layer rather than scraping through the presentation layer, since large companies will want us to integrate with them, not resist us.
If you don’t agree with the perspective I’ve shared here, you likely don’t agree on the immense depth of meaning that language provides.
And I suppose it isn't my place to convince you. But in my eyes, language, text, is imbued with meaning, with soul (or it can and should be). In order for a deep, human, natural language interface to work, you need to know a person’s life circumstances, values, mannerisms, and more. A company that specializes in exactly that is the only way the more personal and useful applications of this technology can thrive.
What it does
It is the MVP for a gigantic vision to power and give life to the LLM economy for consumer-focused applications. This MVP focuses exclusively on the user’s experience, as user support and loyalty are what will give us the leverage to bring our product to enterprise customers.
It allows a user to sign up, connect their socials (WhatsApp, SMS, Instagram), and chat directly with Onyx. Over time, Onyx learns your preferences and will begin doing tasks for you. This iteration is mostly a companion that can keep you company and give advice while maintaining a clear personality (thanks to the novel architecture we’ve developed). Future iterations will integrate with tools.
It’s fully functional and scalable (with the beauty of the cloud, it could in theory handle a million users, leaving aside server costs haha).
How we built it
Here is the overall architecture (without giving away our secret sauce).
- Frontend in NextJS using Tailwind, Framer Motion, and some WebGL.
- Hosted on AWS following a serverless architecture, with copious use of Lambda functions, EC2 instances, and DynamoDB.
- Integrated with OpenAI’s GPT-4 model and various identity providers (GitHub, Gmail, Microsoft).
- The chat aspect itself is built using Python and JavaScript libraries from Twilio, WhatsApp, and Postmark.
Challenges we ran into
Haha, I ran into a few, as expected during a hackathon or while coding any meaningful project. Here are a couple of fun ones off the top of my head. The challenges always end up being the most interesting part.
Extremely slow responses when fetching the user’s identity in webhook endpoints. I solved this one by caching users and refreshing the cache every once in a while, so that each webhook request wasn’t slowed down by an API call.
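To make the caching fix above concrete, here is a minimal sketch of one way it could look. It is not the actual Onyx code: the `UserCache` class, the `fetch_user` callback, and the five-minute TTL are all assumptions for illustration, and it uses a per-user time-to-live rather than whatever refresh schedule the real service uses.

```python
import time
from typing import Callable, Dict, Tuple


class UserCache:
    """Tiny in-memory TTL cache for user records, keyed by phone number.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """

    def __init__(
        self,
        fetch_user: Callable[[str], dict],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_user = fetch_user  # the slow identity lookup (assumed)
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, phone: str) -> dict:
        now = self._clock()
        cached = self._entries.get(phone)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]           # fresh hit: no API call needed
        user = self._fetch_user(phone) # miss or stale entry: refetch and store
        self._entries[phone] = (now, user)
        return user
```

A webhook handler would then call `cache.get(sender)` instead of hitting the identity API directly, so only the first request per user in each TTL window pays for the slow lookup. In a Lambda setting the cache would live at module level and only survive while the container stays warm.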
Wrong content type. I expected Twilio to work with application/json as the content type, but I eventually realized I should set my API to use application/x-www-form-urlencoded instead, which is form data.
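As a rough illustration of the content-type issue above, the sketch below parses a form-encoded webhook body with the Python standard library. It assumes the problem was on the inbound side, where Twilio posts message webhooks as form data with `From` and `Body` fields; the `parse_twilio_webhook` function name and the returned dictionary shape are made up for this example and are not the actual Onyx handler.

```python
from urllib.parse import parse_qs


def parse_twilio_webhook(raw_body: bytes, content_type: str) -> dict:
    """Extract the sender and message text from a form-encoded webhook body.

    Hypothetical helper: the name and return shape are assumptions.
    """
    # Ignore parameters such as "; charset=UTF-8" when checking the media type.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/x-www-form-urlencoded":
        raise ValueError(f"unexpected content type: {content_type}")

    fields = parse_qs(raw_body.decode("utf-8"), keep_blank_values=True)
    # parse_qs returns a list of values per key; take the first of each.
    return {
        "sender": fields.get("From", [""])[0],
        "text": fields.get("Body", [""])[0],
    }
```

Trying to read the same body with a JSON parser fails, which matches the symptom described: the payload looks like `From=...&Body=...`, not a JSON object.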
Trying to use the Serverless Framework to deploy my API endpoints from the command line. I ended up running “sls deploy” around 20 times before realizing that I wasn’t going to get the configuration working properly, so I went and set up my API the old-fashioned way.
Typical sleep-deprived hijinks: misspelling “Mappings” as “Mapppings” and not realizing for half an hour, or calling my serverless function serverless_handler.py instead of serverless.py. Did not find a cure for this class of issues.
Hassles with pre-render fetching in NextJS. I’m fairly new to NextJS, so learning the nuanced differences between it and React that actually matter was a ride.
Accomplishments that we're proud of
Building a fully functional “SaaS” product (despite it being free under the current pricing model) in 24 hours as one person. Architecting and implementing a novel personality structure for large language models that can learn about the identity of the people it speaks with.
What we learned
Lots of things, specifically about how to give rise to a good personality in a chatbot.
What's next for Onyx AI
We will begin exploring partnerships with companies that build consumer-focused products and have an interest in natural language interfaces. The end goal is to be the personality infrastructure that powers huge industries, enabling companies like Airbnb, Zola, Kayak, and Zillow to seamlessly construct user experiences built around natural language interfaces, but we must first build up the moat of happy and committed users.
At the same time, we will continue exploring which aspects of language matter the most when it comes to making a personality-enabled chatbot.
Built With
- amazon-dynamodb
- amazon-web-services
- flask
- nextjs
- node.js
- openai
- postgresql
- python
- react
- twilio