UW-Crushes: The Voice-First Character AI for Waterloo

๐Ÿ’ก Inspiration

Letโ€™s be real: Waterloo students are world-class at solving LeetCode Hards, but "talking to actual humans" remains an NP-hard problem for many of us. We spend all our time in the DC Library or the E7 sub-basement, creating a campus culture that is notoriously socially isolated.

We saw the explosion of Character.AI and thought: What if we could build a voice-first version specifically for the chaotic, high-pressure ecosystem of UWaterloo? instead of just text, we wanted real-time voice interaction that feels like a FaceTime call with a frenemy.

๐Ÿš€ What it does

UW-Crushes is a hyper-realistic voice chat platform that lets students interact with AI agents modeled after distinct University of Waterloo archetypes. It's not just a chatbotโ€”it's a living, breathing simulation of campus social life.

Key Features:

  • ๏ฟฝ๏ธ Real-Time Voice Chat: Powered by LiveKit, users can have low-latency, natural voice conversations with agents. No typing, just talking.
  • ๐ŸŽญ The Crush Roster: Users can match and converse with hyper-specific distinct personalities:
    • The "Geosee/V1 Ghoul": Speaks in Discord emotes and hasn't seen sunlight in weeks.
    • The "Cali-or-Bust Coop Hunter": Turns every topic into a discussion about their Google offer and total compensation (TC).
    • The "First Year Engineer": Overwhelmed, caffeinated, and aggressively defending their faculty.
  • ๐Ÿง  Deep Context Awareness: Unlike generic AI, these agents know everything about UWโ€”from the specific disappointment of a laziz bowl to the migration patterns of Mr. Goose.
  • ๐Ÿ’˜ "Spotted at UW" Simulation: The platform recreates the thrill of the "UW Crushes" subreddit, allowing users to safely practice social interactions, venting, or just banningtering with agents who "get it."

โš™๏ธ How we built it

We architected a high-concurrency multi-agent system using a modular, state-of-the-art tech stack:

๐Ÿง  The AI Brain

  • Agent Logic (LangGraph & Autogen): We utilized LangGraph to orchestrate complex, stateful conversations and Autogen to create persistent "personalities" that remember your past conversations.
  • LLM Backbone (Gemini 1.5 Flash): We chose Gemini 1.5 Flash for its massive context window and ultra-low latency. It allows our agents to be witty and responsive without the awkward "thinking" pauses.
  • Sentiment & Memory: Agents track the "vibes" of the conversation. If you're boring, they might ghost you. If you're interesting, they'll "match" with you.

๐Ÿ–ฅ๏ธ The Frontend

  • Next.js Dashboard: A sleek, responsive interface built with Next.js offering a "Dating App" style UI where users can swipe through agent profiles.
  • LiveKit Integration: The core of our experience. We use LiveKit for real-time audio streaming, ensuring the voice chat feels immediate and personal.

๐ŸŒ Simulated World

  • Custom RAG Pipeline: We ingested specialized datasets (r/uwaterloo, Spotted at UW, academic calendars) to ground our agents in reality.

๐Ÿšง Challenges we ran into

  • Hallucination Tuning: Initially, our agents were too polite. Real Waterloo students are stressed and sarcastic. We had to fine-tune the agents to occasionally give "one-word replies" or be "too busy studying" to make the interaction feel authentic.
  • Latency is Theory, Lag is Reality: Making a voice conversation feel natural requires sub-500ms latency. We optimized our LiveKit + Gemini pipeline aggressively to minimize the "awkward silence" gap.
  • Defining "Cringe" Mathematically: We built a custom scoring engine to detect when a conversation is going downhill so the agent can react appropriately (e.g., by making an excuse to leave).

๐Ÿ† Accomplishments that we're proud of

  • The "Vibe Check": We successfully developed a system where agents react dynamically to your tone of voice and confidence, not just your words.
  • Scary-Accurate Personalities: Our beta testers reported genuine emotion (frustration, laughter) when talking to the "Aggressive Interviewer" or the "Clingy First Year," proving the fidelity of our simulation.
  • Seamless Voice: achieving a near-zero latency feel that makes you forget you're talking to an AI.

๐Ÿ“š What we learned

  • Voice is the Future of Social AI: Text is safe, but voice is vulnerable. Building for voice requires a much higher bar for empathy and timing.
  • The Power of Local Context: Generic "AI friends" are boring. An AI that knows your specific struggle (e.g., failing CS 246) creates an instant bond.

๐Ÿ”ฎ What's next for UW-Crushes

  • Group Chats: We want to enable rooms where multiple different AI agents (e.g., a quiet Math student and a loud AHS student) can interact with you simultaneously.
  • Physical Bot Integration: Connecting our localized agents to physical hardware for campus-specific interactions.

Thanks the data science club for the repo https://github.com/uw-datasci/AI-GF

Built With

  • huggingface
  • smolagents
Share this project:

Updates