Recall AI

Screenshots: Landing Page; Home Page; Audio Log Recording Section; Audio Log Recording Active; Live Video Capture Transcription Section; Voice Clone Creation Section; Talk to Voice AI; Medical Dashboard; Flowchart

Inspiration

Newer generations, with short attention spans and memory, often cannot put their phones down for even a second to cherish a moment. Others suffer from cognitive decline and are unable to conjure up moments of love and of family.

What it does

Recall AI is a solution for people facing memory loss from conditions such as dementia and Alzheimer's, and it also serves as a journaling tool for everyone. Our wearable camera captures daily moments from a first-person perspective and produces transcriptions that are stored securely in a vector database. Users can record audio logs that are converted to text, and can then converse with a voice AI that brings their memories to life. Our voice cloning feature also preserves experiences in the user’s own voice for future generations. With reminders to track important tasks, Recall AI supports both medical care and everyday life.

How we built it

We used Groq and Deepgram to transcribe video and audio data to text, then converted that text to vector embeddings stored in a SingleStore vector database. We use VAPI to create a conversational voice AI that answers questions about these memories. Deepgram converts the user's spoken audio to text, and SingleStore's vector search retrieves the relevant context, which is passed to a Mixtral LLM running on Groq to generate the answer. Cartesia converts the answer text to the audio spoken by the assistant, and also lets us clone our own voice for the assistant to use.
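The retrieval step described above can be illustrated with a minimal sketch. It is not our SingleStore code: the in-memory list stands in for the vector database, the 3-dimensional embeddings are hand-written toy values, and the names `Memory`, `top_k`, and `build_prompt` are hypothetical. It only shows the general idea of ranking stored memories by similarity to a question and packing the best matches into a prompt for the LLM.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    """One transcribed moment plus its embedding (toy 3-dimensional vectors here)."""
    text: str
    embedding: List[float]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding: List[float], memories: List[Memory], k: int = 2) -> List[Memory]:
    """Return the k memories most similar to the query, best match first."""
    ranked = sorted(
        memories,
        key=lambda m: cosine_similarity(query_embedding, m.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: List[Memory]) -> str:
    """Assemble the retrieved memories and the question into one LLM prompt."""
    lines = "\n".join(f"- {m.text}" for m in context)
    return f"Answer using only these memories:\n{lines}\n\nQuestion: {question}"


# Hand-written toy data; a real system would get embeddings from a model.
memories = [
    Memory("Had lunch with a friend at the park.", [0.9, 0.1, 0.0]),
    Memory("Took morning medication at 8am.", [0.0, 0.2, 0.9]),
    Memory("Walked the dog by the lake.", [0.7, 0.6, 0.1]),
]
query_embedding = [1.0, 0.2, 0.0]  # pretend embedding of the question below
question = "Who did I have lunch with?"

context = top_k(query_embedding, memories, k=2)
print(build_prompt(question, context))
```

In a deployed version, the sort over a Python list would be replaced by a similarity query against the vector database, and the resulting prompt would be sent to the LLM.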

Challenges we ran into

In addition to integrating our custom LLM into the 3-step VAPI pipeline, we also faced significant challenges ensuring real-time support and maintaining a smooth transition in the user interface (UI). Achieving seamless performance while handling live audio, video capture, and real-time transcription required a great deal of optimization, especially when balancing it with a user-friendly interface. These technical hurdles required us to fine-tune the system for both speed and usability.
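To make the three-step structure concrete, here is a minimal sketch of one conversational turn as speech-to-text, then LLM, then text-to-speech, with per-stage timing of the kind that helps when tuning for real-time use. This is not the VAPI API: `run_turn` and the three `fake_*` stand-ins are hypothetical, and in a real system each stage would call the respective transcription, LLM, and voice service.

```python
import time
from typing import Callable, Dict, Tuple


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Tuple[bytes, Dict[str, float]]:
    """Run one voice turn through three stages and record seconds spent in each."""
    timings: Dict[str, float] = {}

    start = time.perf_counter()
    question = transcribe(audio)          # stage 1: speech-to-text
    timings["stt"] = time.perf_counter() - start

    start = time.perf_counter()
    answer = generate(question)           # stage 2: custom LLM (with retrieved context)
    timings["llm"] = time.perf_counter() - start

    start = time.perf_counter()
    reply_audio = synthesize(answer)      # stage 3: text-to-speech
    timings["tts"] = time.perf_counter() - start

    return reply_audio, timings


# Stand-in stages so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(question: str) -> str:
    return f"You asked: {question}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


reply, timings = run_turn(b"hello", fake_transcribe, fake_generate, fake_synthesize)
print(reply, sorted(timings))
```

Passing the stages in as callables keeps the custom LLM swappable without touching the other two steps, and the timing dictionary shows which stage dominates latency.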

Accomplishments that we're proud of

Despite the challenges, we are proud of several key accomplishments. First, we developed our own RAG (Retrieval-Augmented Generation) database and retrieval capabilities from scratch, instead of relying on pre-built solutions from platforms like OpenAI or Hugging Face. This gave us more flexibility and customization. Second, we successfully deployed our solution for public use, using ngrok to make it accessible, even if only on a temporary basis. Finally, we built a customizable voice cloning feature that lets users update their voice samples at any time, making the voice AI more personal and accurate.

What we learned

One of the biggest surprises during this project was discovering how easy it is to create and utilize production-ready voice clones and speech-to-text-to-speech models. With the right tools, even a small team can achieve voice AI capabilities that would have seemed highly complex not long ago. We also learned a lot about integrating custom LLMs and handling vector-based data retrieval efficiently.

Future of Recall AI

Shared Space: Enhancing collaborative care by providing a platform for doctors, patients, and family members to interact and access shared memories.
Physical Devices Integration: Envisioning a future where specialized glasses equipped with cameras and microphones make memory capturing seamless and integrated into daily life. Other possibilities include wearable cameras in the form of necklaces, sharing memories in the form of AIs, and Alexa-style physical devices that carry the AI of a particular person.