Phil - AI-Powered Stoic Philosopher
Inspiration
College students are drowning in stress and anxiety, yet we're out here Googling "how to deal with stress" at 2am instead of tapping into 2,000 years of proven wisdom. The ancient Stoics—Marcus Aurelius, Epictetus, Seneca—literally wrote the book on managing adversity, but let's be real: nobody's reading philosophy textbooks when they're having a breakdown during finals week.
I wanted to build something that meets students where they actually are: scrolling on their phones, looking for quick answers, wanting something that just gets it. Phil isn't another generic wellness app—it's your personal Stoic philosopher who speaks your language, understands your problems, and delivers ancient wisdom that actually helps. Plus, I thought it'd be cool to see if I could make an AI sound like a 2,000-year-old philosopher while still being relatable.
What it does
Phil is a multi-modal AI companion that channels Stoic philosophy for student mental health. Here's how it works:
RAG-Powered Wisdom: Phil doesn't just make stuff up—it retrieves actual passages from 1,000+ embedded Stoic texts using a Pinecone vector database. Every response is grounded in authentic philosophy, not hallucinations.
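The retrieval step ranks passages by cosine similarity between the question's embedding and each stored passage's embedding. As a minimal sketch (an illustrative helper, not Phil's actual source—Pinecone computes this server-side):

```typescript
// Illustrative implementation of the cosine-similarity metric used to
// rank retrieved passages. Assumes plain number[] vectors (in Phil's
// case, 768-dimensional embeddings).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // 1 = identical direction, 0 = orthogonal (unrelated), -1 = opposite
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Scores near 1 mean the passage is semantically close to the question, which is why a fixed threshold (0.7 here) can separate relevant wisdom from noise.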
Multi-Modal Everything: Students can interact however they want:
- Text Chat: Natural conversation with streaming responses
- Voice Input: Too tired to type? Just speak (Gemini speech recognition)
- Vision Analysis: Upload images of stressful situations, motivational quotes, or even philosophical thought experiments (yes, I tested it with the trolley problem)
- Voice Output: Phil talks back with ElevenLabs TTS—perfect for when you're walking to class or lying in bed at 3am
Dual Aesthetic Themes: Toggle between Ancient Greek (columns, pottery, marble) and Cosmic Space (nebulas, stars, planets) because timeless wisdom deserves timeless design.
Privacy-First: No login, no data collection, no tracking. Just you and the Stoics.
How we built it
Tech Stack:
- Frontend: Next.js 14 + TypeScript + Tailwind CSS
- AI Services:
- Gemini 1.5 Flash for text generation
- Gemini 2.0 Flash Exp for vision (more on this later)
- Text Embedding 004 for creating 768-dim vectors
- ElevenLabs for voice synthesis
- Vector DB: Pinecone Serverless for RAG
- Deployment: Vercel (because it just works)
The RAG Pipeline:
- User asks a question → convert to 768-dimensional embedding
- Search Pinecone with cosine similarity for top 3 matches (threshold: 0.7)
- Inject retrieved Stoic passages into Gemini's system prompt
- Stream personalized response in real-time
- Fallback to general Stoic principles if no good matches
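Steps 2 and 5 above (thresholding, top-3 selection, and the fallback) can be sketched as a pure function—hypothetical names, not the actual Phil source:

```typescript
// Hypothetical sketch of the filter/fallback stage of the RAG pipeline:
// keep only matches scoring >= 0.7, take the top 3, and fall back to
// general Stoic principles when nothing clears the threshold.
interface Match {
  score: number; // cosine similarity returned by the vector search
  text: string;  // the embedded Stoic passage
}

const FALLBACK_CONTEXT =
  "No specific passage retrieved; answer from general Stoic principles.";

function buildSystemContext(
  matches: Match[],
  threshold = 0.7,
  topK = 3
): string {
  const relevant = matches
    .filter((m) => m.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  if (relevant.length === 0) return FALLBACK_CONTEXT;
  // Numbered passages get injected into the model's system prompt
  return relevant.map((m, i) => `Passage ${i + 1}: ${m.text}`).join("\n");
}
```

Keeping this stage as a pure function makes the threshold easy to tune: swap in a different cutoff and re-run the same test queries without touching the Pinecone or Gemini calls.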
Multi-Modal Magic:
- Voice input: MediaRecorder API → Gemini Audio → transcription
- Vision: FileReader → base64 encoding → Gemini Vision API
- Streaming: ReadableStream with chunked transfer encoding
- Audio: ElevenLabs API → HTML5 Audio element
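On the receiving end of the streaming route, chunks arrive as raw bytes and have to be reassembled into text. An illustrative sketch (not the actual Phil client code) of the decoding step:

```typescript
// Reassemble streamed response bytes into text. TextDecoder in streaming
// mode correctly handles UTF-8 characters that get split across chunk
// boundaries mid-transfer.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    // stream: true tells the decoder a multi-byte sequence may continue
    // in the next chunk, so it buffers incomplete bytes instead of
    // emitting replacement characters
    text += decoder.decode(chunk, { stream: true });
  }
  text += decoder.decode(); // flush any buffered trailing bytes
  return text;
}
```

In the real app each decoded piece would be appended to the chat UI as it arrives rather than collected into one string, but the decoding concern is the same.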
Built 4 Next.js API routes (chat, vision, TTS, STT), wired them all together, added way too many CSS animations, and somehow made it all work in 24 hours.
Challenges we ran into
The Gemini Vision Hunt: Started with gemini-1.5-flash for vision analysis. Got a 404. Switched to gemini-1.5-pro—worked but was slow. Finally landed on gemini-2.0-flash-exp (experimental model) which was perfect. Three models later, we got there!
ElevenLabs Credit Crisis: Burned through my free tier credits fast because I forgot each response costs ~240 credits. Had to get creative with token limits and strategic muting during testing. Nothing teaches you API cost management like running out of credits at 2am.
RAG Threshold Tuning: Spent way too long finding the sweet spot (0.7 similarity threshold). Too low = irrelevant ancient text about farming. Too high = no matches at all. Built a fallback system so Phil always has something useful to say.
Streaming State Management: Getting text to stream in real-time while managing conversation state and handling errors was... an experience. But seeing responses appear word-by-word made it all worth it.
Solo Everything: Being a one-person team meant every bug, every design decision, and every 3am debugging session was on me. But also meant I could move fast and pivot quickly without coordination overhead.
Accomplishments that we're proud of
Actually Built Production RAG: Not just "I called an API"—built a complete vector search pipeline with embeddings, similarity scoring, context injection, and fallback logic. The responses are genuinely grounded in Stoic texts.
Multi-Modal Done Right: Seamlessly integrated text, voice input, vision, and voice output. Each modality feels natural and works together cohesively.
Solo Speed Run: Built this entirely solo in one hackathon weekend. From concept to deployment to demo video. Pretty proud of that.
What we learned
Vector Databases Are Cool: Building RAG from scratch taught me way more than any tutorial could. Understanding embedding dimensions, similarity metrics, and threshold tuning was invaluable.
Multi-Modal Coordination Is Hard: Each AI service has different requirements—base64 for images, audio blobs for voice, streaming for text. Making them all work together seamlessly requires careful architecture.
Streaming > Waiting: The UX difference between streaming and waiting for complete responses is massive. Users perceive streaming as instant even though total time is the same.
API Costs Matter: Free tiers run out. Token limits are real. Design with costs in mind from the start, not after you've already burned through credits.
Solo Hackathons Are Intense: No teammates to split work with, but also no coordination overhead. You move fast, make all the decisions, and own every line of code. Would recommend.
What's next for Phil
Scale the Knowledge Base: 1,000 passages → 10,000+ with better chunking and metadata
User Accounts: Cross-device sync, growth tracking, personalized insights
Daily Stoic Practices: Guided exercises, journaling prompts, morning/evening reflections
Source Citations: Show exact Stoic text sources (e.g., "Meditations 4.3") for transparency
Mobile App: Native iOS/Android for push notifications and on-the-go wisdom
Community Features: Share insightful conversations (anonymously), learn from others
University Partnerships: Integrate with campus mental health resources
Built for HackTX 2025
Built With
- ai
- elevenlabs
- gemini-vision
- google-gemini
- mental-health
- next.js
- philosophy
- pinecone
- rag
- react
- speech-recognition
- stoicism
- streaming
- tailwind-css
- text-to-speech
- typescript
- vector-database
- vercel
