Inspiration
Journaling has always been a powerful tool for self-reflection, but we wanted to make it more natural, conversational, and accessible — like talking to a trusted friend. This inspired us to create Ansh.ai, a real-time agentic AI friend that listens to your thoughts, understands your emotions, and helps you process your day through meaningful conversation.
The name Ansh stands for Artificial Narrator for Self-reflection and Harmony, but in Sanskrit, it also means “a part of you.” This dual meaning captures our goal — building an AI that feels personal, empathetic, and truly connected to the user’s inner world.
What it does
Ansh.ai acts as a real-time conversational diary companion.
- You speak, Ansh listens.
- It reflects on your day with you, asking gentle questions and providing insights.
- It remembers past conversations, helping you connect patterns over time.
- Responses are presented not just as text, but through a talking AI avatar that feels more human and approachable.
Whether you’re venting after a long day, celebrating small wins, or reflecting on personal growth, Ansh is always there to listen and respond with thoughtful reflections.
How we built it
We combined multiple technologies into one seamless pipeline:
- Google Speech-to-Text for fast and accurate transcription.
- Llama 3.2 3B (quantized), running locally on AMD hybrid architecture and fine-tuned to handle reflective conversations.
- ChromaDB + LangChain RAG to store and retrieve conversation history, enabling long-term memory.
- HeyGen Streaming API to generate talking avatar responses, adding a more human-like touch.
- All of this is orchestrated through a custom Flask API server.
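The flow above can be sketched end-to-end as a single conversational turn. The function bodies here are hypothetical stand-ins for the real services (Google Speech-to-Text, the local quantized Llama model, ChromaDB retrieval, and the HeyGen avatar call), stubbed so the shape of the pipeline is visible:

```python
# Sketch of one Ansh.ai conversational turn. Each step function is a
# placeholder for the actual integration, not the real implementation.

def transcribe(audio_bytes: bytes) -> str:
    """Stand-in for Google Speech-to-Text."""
    return audio_bytes.decode("utf-8")  # pretend the audio is already text

def retrieve_memories(user_id: str, query: str) -> list[str]:
    """Stand-in for the ChromaDB + LangChain retrieval step."""
    return [f"Yesterday {user_id} mentioned feeling tired."]

def generate_reflection(query: str, memories: list[str]) -> str:
    """Stand-in for local Llama 3.2 inference with retrieved context."""
    context = " ".join(memories)
    return f"Given that {context.lower()} How does today compare?"

def render_avatar(text: str) -> dict:
    """Stand-in for the HeyGen streaming avatar call."""
    return {"avatar_script": text}

def handle_turn(user_id: str, audio_bytes: bytes) -> dict:
    """One turn: speech -> memory retrieval -> reflection -> avatar."""
    text = transcribe(audio_bytes)
    memories = retrieve_memories(user_id, text)
    reply = generate_reflection(text, memories)
    return render_avatar(reply)

result = handle_turn("anya", b"I had a long day at work.")
print(result["avatar_script"])
```

In the real system each step is an external or local-model call, which is exactly why the latency section below mattered so much.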
Challenges we ran into
- Model Selection: Finding an LLM that balanced conversational quality, reflection depth, and efficient performance on our hardware was critical.
- Long-term Memory & Retrieval Logic: Designing a retrieval system that could pull meaningful memories from past conversations without slowing down the whole pipeline was particularly challenging.
- Real-time Performance: Each processing step (transcription, inference, retrieval, avatar generation) adds latency. Keeping the system fast enough for fluid conversation required deep optimization across the stack.
Accomplishments that we're proud of
- Successfully built a full-stack agentic AI system that combines speech, memory, reflection, and avatar responses into one smooth experience.
- Created an AI friend that feels personal, remembers your stories, and grows with you over time.
- Learned to optimize each component (LLM inference, RAG retrieval, and external API calls) to work as efficiently as possible within our hardware limits.
What we learned
- Building multi-modal agentic AI systems requires balancing real-time interaction, memory management, and hardware constraints.
- Effective self-reflection conversations demand a different kind of LLM tuning, focused more on empathy and curiosity than factual answers.
- Retrieval systems need contextual smarts — pulling memories is not just about similarity, but also about emotional relevance.
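The last point can be made concrete with a toy scoring function. The vectors, emotion tags, and boost weight below are illustrative assumptions, not our production values; the idea is simply that a memory with matching emotional tone can outrank a slightly more similar but emotionally unrelated one:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def memory_score(query_vec, memory_vec, query_emotion, memory_emotion,
                 emotion_boost=0.3):
    """Blend semantic similarity with a bonus when emotions match."""
    base = cosine(query_vec, memory_vec)
    return base + (emotion_boost if query_emotion == memory_emotion else 0.0)

memories = [
    {"text": "Got the promotion!", "vec": [0.9, 0.1], "emotion": "joy"},
    {"text": "Argued with a friend.", "vec": [0.8, 0.3], "emotion": "sad"},
]
query = {"vec": [0.85, 0.2], "emotion": "sad"}

best = max(memories, key=lambda m: memory_score(
    query["vec"], m["vec"], query["emotion"], m["emotion"]))
print(best["text"])  # similarity alone would have picked the other memory
```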
What's next for Realtime Agentic AI friend for journal keeping (Ansh.ai)
- Performance Boost: Move from Flask to FastAPI to unlock better async handling.
- Hardware Optimization: Explore running Llama 3.2 1B on a local GPU to further reduce inference time.
- Memory Evolution: Build smarter long-term memory heuristics that adapt based on user emotion, not just keywords.
- Emotion Detection: Add sentiment and emotion detection from voice tone, enriching memory retrieval with emotional context.
- Multi-Modal Reflection: Allow users to optionally add photos or drawings to their reflections, making Ansh a richer, multi-modal journal companion.
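As a rough illustration of why the move to async handling matters (the step names and timings here are made up), independent I/O-bound steps such as a ChromaDB lookup and an emotion-analysis call can overlap instead of running back to back:

```python
import asyncio
import time

async def fetch_memories(query: str) -> list[str]:
    await asyncio.sleep(0.1)  # simulated ChromaDB round trip
    return ["memory about " + query]

async def detect_emotion(query: str) -> str:
    await asyncio.sleep(0.1)  # simulated emotion-analysis call
    return "calm"

async def handle_turn(query: str):
    # Both steps only need the transcript, so they can run concurrently.
    memories, emotion = await asyncio.gather(
        fetch_memories(query), detect_emotion(query))
    return memories, emotion

start = time.perf_counter()
memories, emotion = asyncio.run(handle_turn("my day"))
elapsed = time.perf_counter() - start
print(f"{emotion} in {elapsed:.2f}s")  # roughly 0.1s, not 0.2s
```

FastAPI serves endpoints as coroutines natively, which is what would let a production version apply this kind of overlap per request.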