Phil - AI-Powered Stoic Philosopher
Inspiration
College students are drowning in stress and anxiety, yet we're out here Googling "how to deal with stress" at 2am instead of tapping into 2,000 years of proven wisdom. The ancient Stoics—Marcus Aurelius, Epictetus, Seneca—literally wrote the book on managing adversity, but let's be real: nobody's reading philosophy textbooks when they're having a breakdown during finals week.
I wanted to build something that meets students where they actually are: scrolling on their phones, looking for quick answers, wanting something that just gets it. Phil isn't another generic wellness app—it's your personal Stoic philosopher who speaks your language, understands your problems, and delivers ancient wisdom that actually helps. Plus, I thought it'd be cool to see if I could make an AI sound like a 2,000-year-old philosopher while still being relatable.
What it does
Phil is a multi-modal AI companion that channels Stoic philosophy for student mental health. Here's how it works:
RAG-Powered Wisdom: Phil doesn't just make stuff up—it retrieves actual passages from 1,000+ embedded Stoic texts using a Pinecone vector database. Every response is grounded in authentic philosophy, not hallucinations.
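The retrieval step ranks passages by cosine similarity between the question's embedding and each stored passage's embedding. As a minimal sketch (an illustrative helper, not Phil's actual source—Pinecone computes this server-side):

```typescript
// Illustrative implementation of the cosine-similarity metric used to
// rank retrieved passages. Assumes plain number[] vectors (in Phil's
// case, 768-dimensional embeddings).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // 1 = identical direction, 0 = orthogonal (unrelated), -1 = opposite
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Scores near 1 mean the passage is semantically close to the question, which is why a fixed threshold (0.7 here) can separate relevant wisdom from noise.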
Multi-Modal Everything: Students can interact however they want:
- Text Chat: Natural conversation with streaming responses
- Voice Input: Too tired to type? Just speak (Gemini speech recognition)
- Vision Analysis: Upload images of stressful situations, motivational quotes, or even philosophical thought experiments (yes, I tested it with the trolley problem)
- Voice Output: Phil talks back with ElevenLabs TTS—perfect for when you're walking to class or lying in bed at 3am
Dual Aesthetic Themes: Toggle between Ancient Greek (columns, pottery, marble) and Cosmic Space (nebulas, stars, planets) because timeless wisdom deserves timeless design.
Privacy-First: No login, no data collection, no tracking. Just you and the Stoics.
How we built it
Tech Stack:
- Frontend: Next.js 14 + TypeScript + Tailwind CSS
- AI Services:
- Gemini 1.5 Flash for text generation
- Gemini 2.0 Flash Exp for vision (more on this later)
- Text Embedding 004 for creating 768-dim vectors
- ElevenLabs for voice synthesis
- Vector DB: Pinecone Serverless for RAG
- Deployment: Vercel (because it just works)
The RAG Pipeline:
- User asks a question → convert to 768-dimensional embedding
- Search Pinecone with cosine similarity for top 3 matches (threshold: 0.7)
- Inject retrieved Stoic passages into Gemini's system prompt
- Stream personalized response in real-time
- Fallback to general Stoic principles if no good matches
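Steps 2 and 5 above (thresholding, top-3 selection, and the fallback) can be sketched as a pure function—hypothetical names, not the actual Phil source:

```typescript
// Hypothetical sketch of the filter/fallback stage of the RAG pipeline:
// keep only matches scoring >= 0.7, take the top 3, and fall back to
// general Stoic principles when nothing clears the threshold.
interface Match {
  score: number; // cosine similarity returned by the vector search
  text: string;  // the embedded Stoic passage
}

const FALLBACK_CONTEXT =
  "No specific passage retrieved; answer from general Stoic principles.";

function buildSystemContext(
  matches: Match[],
  threshold = 0.7,
  topK = 3
): string {
  const relevant = matches
    .filter((m) => m.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  if (relevant.length === 0) return FALLBACK_CONTEXT;
  // Numbered passages get injected into the model's system prompt
  return relevant.map((m, i) => `Passage ${i + 1}: ${m.text}`).join("\n");
}
```

Keeping this stage as a pure function makes the threshold easy to tune: swap in a different cutoff and re-run the same test queries without touching the Pinecone or Gemini calls.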
Multi-Modal Magic:
- Voice input: MediaRecorder API → Gemini Audio → transcription
- Vision: FileReader → base64 encoding → Gemini Vision API
- Streaming: ReadableStream with chunked transfer encoding
- Audio: ElevenLabs API → HTML5 Audio element
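On the receiving end of the streaming route, chunks arrive as raw bytes and have to be reassembled into text. An illustrative sketch (not the actual Phil client code) of the decoding step:

```typescript
// Reassemble streamed response bytes into text. TextDecoder in streaming
// mode correctly handles UTF-8 characters that get split across chunk
// boundaries mid-transfer.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    // stream: true tells the decoder a multi-byte sequence may continue
    // in the next chunk, so it buffers incomplete bytes instead of
    // emitting replacement characters
    text += decoder.decode(chunk, { stream: true });
  }
  text += decoder.decode(); // flush any buffered trailing bytes
  return text;
}
```

In the real app each decoded piece would be appended to the chat UI as it arrives rather than collected into one string, but the decoding concern is the same.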
Built 4 Next.js API routes (chat, vision, TTS, STT), wired them all together, added way too many CSS animations, and somehow made it all work in 24 hours.
Challenges we ran into
The Gemini Vision Hunt: Started with gemini-1.5-flash for vision analysis. Got a 404. Switched to gemini-1.5-pro—worked but was slow. Finally landed on gemini-2.0-flash-exp (experimental model) which was perfect. Three models later, we got there!
ElevenLabs Credit Crisis: Burned through my free tier credits fast because I forgot each response costs ~240 credits. Had to get creative with token limits and strategic muting during testing. Nothing teaches you API cost management like running out of credits at 2am.
RAG Threshold Tuning: Spent way too long finding the sweet spot (0.7 similarity threshold). Too low = irrelevant ancient text about farming. Too high = no matches at all. Built a fallback system so Phil always has something useful to say.
Streaming State Management: Getting text to stream in real-time while managing conversation state and handling errors was... an experience. But seeing responses appear word-by-word made it all worth it.
Solo Everything: Being a one-person team meant every bug, every design decision, and every 3am debugging session was on me. But also meant I could move fast and pivot quickly without coordination overhead.
Accomplishments that we're proud of
Actually Built Production RAG: Not just "I called an API"—built a complete vector search pipeline with embeddings, similarity scoring, context injection, and fallback logic. The responses are genuinely grounded in Stoic texts.
Multi-Modal Done Right: Seamlessly integrated text, voice input, vision, and voice output. Each modality feels natural and works together cohesively.
Solo Speed Run: Built this entirely solo in one hackathon weekend. From concept to deployment to demo video. Pretty proud of that.
What we learned
Vector Databases Are Cool: Building RAG from scratch taught me way more than any tutorial could. Understanding embedding dimensions, similarity metrics, and threshold tuning was invaluable.
Multi-Modal Coordination Is Hard: Each AI service has different requirements—base64 for images, audio blobs for voice, streaming for text. Making them all work together seamlessly requires careful architecture.
Streaming > Waiting: The UX difference between streaming and waiting for complete responses is massive. Users perceive streaming as instant even though total time is the same.
API Costs Matter: Free tiers run out. Token limits are real. Design with costs in mind from the start, not after you've already burned through credits.
Solo Hackathons Are Intense: No teammates to split work with, but also no coordination overhead. You move fast, make all the decisions, and own every line of code. Would recommend.
What's next for Phil
Scale the Knowledge Base: 1,000 passages → 10,000+ with better chunking and metadata
User Accounts: Cross-device sync, growth tracking, personalized insights
Daily Stoic Practices: Guided exercises, journaling prompts, morning/evening reflections
Source Citations: Show exact Stoic text sources (e.g., "Meditations 4.3") for transparency
Mobile App: Native iOS/Android for push notifications and on-the-go wisdom
Community Features: Share insightful conversations (anonymously), learn from others
University Partnerships: Integrate with campus mental health resources
Built for HackTX 2025
Built With
- ai
- elevenlabs
- gemini-vision
- google-gemini
- mental-health
- next.js
- philosophy
- pinecone
- rag
- react
- speech-recognition
- stoicism
- streaming
- tailwind-css
- text-to-speech
- typescript
- vector-database
- vercel
