Inspiration
When we started this project, we had just read a couple of articles about the mental health benefits of journalling and keeping a diary. Most people, however, find journalling a high-effort activity. So we wanted to build a solution that automatically helps people keep track of the most important moments of their lives.
What it does
MemorAI is designed to connect to microphone-equipped wearables, such as the Meta Ray-Ban glasses or smartwatches. Users record their entire day through the device's microphone, and our application processes the recording. Using AI, we extract the most emotional, meaningful, and important moments from the user's day and store them as memories. Users can search through their memories, which are automatically sorted into categories, and re-listen to audio clips of these precious moments. MemorAI ensures that you never forget the important moments again and can relive them whenever you want. The app also has a chat feature: a conversational interface over a map of your memories, where you can find out information about yourself that you may have forgotten.
How we built it
The core of MemorAI is a fine-tuned generative AI engine built on GPT-4o. When a user uploads an audio clip, we first transcribe the speech using Google's speech-to-text APIs. The text is then chunked, preprocessed using custom NLP algorithms, and sent to our generative AI. The AI analyzes the content of the recording, identifies the most memorable moments, and categorizes and summarizes them. We then find the corresponding timestamps in the audio recording and clip those segments. All this data is stored in MongoDB Atlas, where we also use vector embeddings to search through memories effectively. For the conversational experience, we implemented NLP algorithms that turn our database of memories into a retrieval-augmented generation (RAG) pipeline by classifying the data into more useful categories, which are then connected to a GPT-based chatbot that users can access.
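As a rough illustration of the chunking and timestamp-lookup steps, here is a minimal Python sketch. It assumes the transcription step yields word-level start and end times; the `Word` and `Chunk` types, the `chunk_transcript` and `clip_bounds` helpers, the 50-word chunk size, and the one-second padding are all hypothetical names and values chosen for this example, not the project's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    """One transcribed word with its position in the recording (assumed input shape)."""
    text: str
    start: float  # seconds from the start of the recording
    end: float


@dataclass
class Chunk:
    """A run of consecutive words plus the time span it covers."""
    text: str
    start: float
    end: float


def chunk_transcript(words: List[Word], max_words: int = 50) -> List[Chunk]:
    """Group words into fixed-size chunks, keeping each chunk's time span."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    chunks = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        chunks.append(
            Chunk(
                text=" ".join(w.text for w in group),
                start=group[0].start,
                end=group[-1].end,
            )
        )
    return chunks


def clip_bounds(
    chunk: Chunk, padding: float = 1.0, total_duration: Optional[float] = None
) -> Tuple[float, float]:
    """Return (start, end) in seconds for the audio clip around a chunk.

    The padding widens the clip slightly and is clamped to the recording.
    """
    start = max(0.0, chunk.start - padding)
    end = chunk.end + padding
    if total_duration is not None:
        end = min(end, total_duration)
    return start, end
```

Because each chunk carries its own time span, any chunk the model flags as a memorable moment can be mapped straight back to a clip of the original audio without re-scanning the recording.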
Challenges we ran into
One of the main challenges we faced was optimizing the audio processing involved in this application, since it is designed to work with hours of audio at a time. We started out by passing audio directly as input to the generative AI, but this incurred heavy processing and network overhead. We ended up implementing a two-stage process, transcription followed by analysis, using Google's APIs, which performed significantly better.
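The shape of that two-stage process can be sketched in Python as follows. This is only a skeleton under stated assumptions: `transcribe` and `analyze` are hypothetical callables standing in for the speech-to-text call and the generative AI call, and the 60-second segment length is an arbitrary value picked for the example.

```python
from typing import Callable, List, Tuple


def split_into_segments(
    total_seconds: float, segment_seconds: float = 60.0
) -> List[Tuple[float, float]]:
    """Cut a long recording into consecutive (start, end) windows in seconds."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def two_stage_pipeline(
    total_seconds: float,
    transcribe: Callable[[float, float], str],  # hypothetical speech-to-text wrapper
    analyze: Callable[[str], List[str]],        # hypothetical generative AI wrapper
    segment_seconds: float = 60.0,
) -> List[str]:
    """Stage 1 turns audio into text; stage 2 only ever sees the text."""
    # Stage 1: transcribe each audio segment independently.
    parts = [
        transcribe(start, end)
        for start, end in split_into_segments(total_seconds, segment_seconds)
    ]
    transcript = " ".join(p for p in parts if p)
    # Stage 2: send the compact transcript, not the raw audio, for analysis.
    return analyze(transcript)
```

The point of the split is that the generative model receives text only, which is far smaller than hours of raw audio.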
What's next for MemorAI
The main next step is integrating MemorAI directly with data feeds from wearable devices, rather than having users manually upload clips, so that data can be analyzed continuously and user effort is reduced.
Built With
- generative
- natural-language-processing
- openai
- python
- rag
- speechtotext