Inspiration
What if you could rewind time to make up for your past regrets? What if you could create a virtual version of yourself to avoid future mistakes? What if you could build a memory bank for your loved ones? What if you can have an AI spokesperson for yourself?
Millions around us wish to converse with their unavailable loved ones, create a virtual version of themselves and seek real human conversations to satisfy their personalized demands, for either emotional support or personal management. It is now a reality with AiEgo - build and interact with a real copy of yourself, loved ones, celebrities, or anyone imaginable.
What it does
AiEgo is an artificially sentient being that creates a sense of real interactive intimacy for users in an instant. Through the rapid learning of uploader’s chat history, photos, videos, and speech by a large language model, it generates an emotional connection with the user within the first few moments. It feels like chatting with an old friend, and that friend can be anyone imaginable. Able to be plugged into any mainstream SNS platforms, our test pilot is run on the Facebook Ecosystem (Facebook Messenger, WhatsApp, and Instagram).
AiEgo encompasses diverse functions and roles to cater to the needs of various users.
Spokesperson: AiEgo is designed to build a highly personalized human experience for the users based on their own histories. It acts like a spokesman for yourself to help you communicate with anyone e.g.the fan groups of influencers.
Emotional Support: AiEgo serves as a confidante for those seeking a safe space to express their feelings. It directs conversations towards emotional topics, enabling users to articulate their thoughts and feelings openly.
Life Coach: AiEgo can act as a personal life coach, helping users stay focused on their tasks. It provides gentle reminders and motivates users to follow through with their plans and commitments.
How we built it
A blend of UC Berkeley's PhDs and MBAs, our squad fuses a diverse reservoir of knowledge in text-to-speech, Natural Language Processing, and serial entrepreneurship to breathe life into our invention.
AiEgo is a mix of multiple cutting-edge models, with React as the front-end and FastAPI bolstering the back-end. We self-developed text-to-speech and lip sync models based on only 10-seconds voices and an automatic machine learning pipeline to summarize and learn personal information based on the raw data of user's chat history and personal statement. Using prompt learning and public tools such as OpenAI GPT-4 and Hume.AI, we are able to generate humane gene responses mimicking the users themselves.
Challenges we ran into
Our first challenge was model fine-tuning. Despite multiple efforts, achieving optimal model performance in such a short amount of time is difficult. Complicating matters was the integration of frontend, backend, and machine learning models for text, audio, and video processing. This required not just a vast breadth of knowledge but also a process of constant trial-and-error debugging.
The second significant challenge we grappled with involved the process of prompt learning. It demanded a meticulous design of prompts, each requiring a unique blend of precision and creativity, which posed a labor-intensive task to be deployed in a short period of time.
Accomplishments that we're proud of
We successfully deployed an authentic human-like experience on Facebook Messenger within a remarkably short timeframe, incorporating text, voice, images, and videos in the interactions. To expedite deployment, we synergized individual pieces of data and integrated them seamlessly into the final product. Each virtual copy boasts a blend of memory, voice, image, and name, each element contributed by a different team member. Try it, you WILL be impressed.
What we learned
The key learning from our journey was the combination of a multitude of APIs within a tight 24-hour development window. A secondary revelation was the critical need for prudent resource allocation during such challenges. In particular, we had to split our resources to develop the ChatBot and text to speech separately before stitching them back together.
What's next for AiEgo
Our roadmap contemplates honing the Machine Learning models, fine-tuning the voice and language models, and to achieve a delicate balance between latency and accuracy – pushing the envelope for minimal delay with maximal precision. We are also going to identify clearer user groups by beta testing.
Built With
- automatic-summarization
- hume
- lip-sync
- natural-language-processing
- openai
- prompt-learning
- text-to-speech
Log in or sign up for Devpost to join the conversation.