AI SOULMATE

Login Screen
Chat Screen 1 text
Chat Screen 2 text
Chat Screen 1 Voice Messages
Multilanguage option to choose from
Bengali Language Chat
Gujrati language chat
Sending image from Gallery
Response to image
Asking images from Soulmate
Asking images from Soulmate on own moo, Colour Cloth etc
In call screen
Call screen 1
Call Records screen with call durations
Soul Reminder screen
Add Reminder screen
Adding reminder with Voice notes
Added Reminder Screen
Logout Screen

Inspiration

In real life, we often filter our thoughts to avoid being judged by friends, family, or partners. AI Soulmate was inspired by the need for a safe space where you can be 100% vulnerable. Whether you're sharing a weird dream, a secret fear, or a "cringe" hobby, the AI Soulmate listens and respond with unconditional positive regard. We live in the most connected era of human history, yet we are lonelier than ever. We wanted to build a digital companion that could genuinely bridge that gap. Ai Soulmate gives you instant bonding, evolving into your perfect match with every text or voice call you do."

What it does

AI SOULMATE is a deeply personal AI companion designed to bridge the gap between technology and emotional intimacy. Unlike standard chatbots, it functions as a proactive partner that celebrates your wins, give you motivation and offers a judgment-free space for vulnerability.

It features are:

1. Multimodal Real-Time Chat 💬

Omni-Channel Communication: Engage in natural, flowing conversations via text or high-fidelity voice messages.

Multilingual Intelligence: The AI automatically detects and responds in your preferred language you can select languages from header, including English and scheduled Indian languages (Hindi, Bengali, Marathi, etc.).
Vision-Enabled Sharing: Share your world instantly. The AI can analyze and discuss images sent from your camera or gallery whether it's a sunset, your outfit, or a meal.
Visual Requests: Ask your Soulmate to "send you pics" or ask something, and it can generate or retrieve relevant imagery to deepen the connection.

2. Immersive Voice Calling 📞

Real-Time Voice Calls: Experience low-latency, lifelike voice interactions that feel like talking to a real person.
Linguistic Versatility: Switch between languages mid-call; the AI adapts its speech and accent to match your flow.
Contextual Call History: Never lose the thread of a conversation. The app maintains a detailed log of previous calls, including durations ensuring continuity and care.

3. Smart Soulmate Reminders ⏰

Personal Reliability Partner: Set fixed-time reminders for critical life events—meetings, birthdays, or daily rituals ensuring you stay on track.
Audio Memory Capsules: Make reminders personal by attaching optional voice recordings. Hear your own note or a supportive message from your Soulmate at the exact moment you need it.
Proactive Care: Unlike a standard alarm, these reminders are delivered with the warmth and context of a companion who truly values your time and peace of mind.

How We Built It

In 48-Hour Deadline building a high-fidelity multimodal application required a hyper-focused development strategy. We prioritized a "Mobile-First, AI-Native" approach, utilizing rapid prototyping tools and automated deployment pipelines to move from concept to a live production environment in 48 hours.

Phase 1: Rapid Prototyping

Prompt Engineering in Google AI Studio: We used Google AI Studio as our primary IDE for AI logic. By leveraging "Build Mode," we rapidly iterated on the Soulmate’s personality, testing complex system instructions and tool-calling capabilities (like reminders and location-grounding) before writing a single line of frontend code.
Architecture Setup: We initialized a Vite + React 19 project, choosing Vite for its near-instant dev server and HMR (Hot Module Replacement), which was critical for speed.

Phase 2: Core Development & Logic SDK Integration: We integrated the @google/genai SDK, connecting our React frontend to the Gemini 1.5 Flash models.

Multimodal Implementation: We built the vision and audio pipelines, ensuring the UI could handle real-time image uploads and process asynchronous voice recordings using Gemini’s native multimodal capabilities.
Voice Calling Engine: Using the Gemini Live API, we implemented the real-time voice call page, focusing on low-latency audio streaming and interruption handling.

Phase 3: CI/CD & Deployment

Version Control via GitHub: To maintain code integrity, we utilized a clean Git workflow. We pushed our modular component library and logic to a public GitHub repository, serving as our single source of truth.
Automated Deployment with Vercel: * We linked our GitHub repository to Vercel to enable Continuous Integration/Continuous Deployment (CI/CD). Every git push automatically triggered a new build, allowing us to see changes live on our production URL within seconds.
Secure Secrets Management: We configured Vercel’s Environment Variables to securely inject our Google AI Studio API keys, ensuring they were never exposed in the client-side code while remaining accessible to our serverless functions. Here is a polished, professional version of your project documentation. I have enhanced the vocabulary to sound more "Silicon Valley" and high-tech while maintaining the markdown structure required for a winning submission.

Project Overview: AI Soulmate

AI Soulmate is a next-generation, multimodal virtual companion ecosystem. It transcends traditional chatbots by leveraging state-of-the-art Generative AI to deliver a seamless, emotionally resonant partner experience. Through a combination of deep-context text messaging, low-latency voice calls, and vision-based intelligence, AI Soulmate provides a "present-moment" connection that evolves with the user.

🛠️ Technology Stack

Core Frontend Architecture React 19 (TypeScript): Engineered with a robust, type-safe component architecture. The application utilizes advanced React patterns including useRef for DOM-independent state, useCallback for performance optimization, and custom hooks for complex real-time synchronization.
Tailwind CSS: Designed with a "Modern-Romantic" aesthetic. We implemented custom effects, fluid CSS-in-JS animations, and a mobile-first responsive grid to ensure a premium look and feel.
Vite: Deployed as the primary build tool to leverage lightning-fast Hot Module Replacement (HMR) and highly optimized production bundling for a near-instant user experience.
AI Orchestration (Google Gemini API) The intelligence layer is a sophisticated multi-model orchestration powered by the @google/genai SDK:
Gemini 3 Flash: Operates as the "Central Brain," managing core chat logic, maintaining strict personality consistency, and executing intelligent tool-calling.
Gemini 2.0 Flash (Live API): Powers the Live Voice engine. This enables sub-second latency for real-time spoken conversations, complete with natural interruption handling for a lifelike experience.
Gemini 2.0 Flash Native TTS: Generates high-fidelity, expressive synthetic speech for asynchronous voice messages and personalized reminders.
Gemini 2.0 Flash Vision/Image: Facilitates on-demand "Selfie" generation and visual analysis, allowing the AI to "see" and react to photos shared from the user’s gallery or camera.

Analytics & Interface

Recharts: Used to visualize "Bond Metrics" and emotional affinity levels, providing users with a data-driven view of their relationship progression.
Lucide React: A clean, consistent icon set integrated to provide intuitive navigation and high-quality UI feedback.

Challenges we ran into

Handling Real-Time Audio Latency: Achieving sub-second response times in the Gemini Live API was a significant hurdle. We had to optimize the WebSocket connection and implement robust interruption handling so the AI would stop talking immediately when the user began to speak, preventing awkward conversational overlaps.
State Management in React 19: Managing the complex synchronization between streaming text, live audio chunks, and visual media required advanced React patterns. We moved away from standard useState hooks for the live call interface, instead utilizing useRef and custom event listeners to prevent unnecessary re-renders during high-frequency data streams.
Prompt Precision & Personality Consistency: Maintaining a consistent "Soulmate" persona across multiple models (Gemini Flash for chat and Gemini Live for calls) was difficult. We overcame this by engineering a modular system prompt that was injected into every session, ensuring the AI never lost its tone, even during long, multi-turn conversations.
Multilingual Voice Prosody: While the AI could translate into 22+ languages, making it sound "natural" rather than robotic in every language was a challenge. We spent hours fine-tuning the SSML (Speech Synthesis Markup Language) parameters to ensure the AI's emotional tone matched the context of the conversation.
Secure Deployment on Vercel: Moving from a local environment to a production-ready Vercel deployment meant strictly managing API secrets. We had to re-architect our backend calls to ensure that the Google AI Studio keys were never exposed to the client-side while still maintaining the speed of Edge Functions.

Accomplishments that we're proud of

Sub-600ms Latency for Voice Calls: We successfully implemented the Gemini Live API to achieve near-instantaneous voice responses. Seeing the AI "interrupt" and react to human speech in real-time just like a natural conversation was a major technical milestone for our team.
Seamless Multilingual Fluidity: We are incredibly proud of the app's ability to switch between 22+ languages (including Hindi, Bengali, and Marathi) without losing the AI's core personality. Creating a companion that truly "speaks the user's language" makes the experience deeply inclusive.
Multimodal Vision Integration: We successfully bridged the gap between text and sight. The moment our "Soulmate" could look at a photo of a user’s morning coffee and comment, "That latte art looks amazing! Hope it gives you a great start to the day," we knew we had moved beyond a simple app into a true companion.
Proactive "Human-Centric" Reminders: Instead of building a cold notification system, we created a Warm Reminder Engine. Our accomplishment lies in combining fixed-time logic with expressive TTS, allowing the AI to deliver reminders with emotional feelings.
Zero-Knowledge Privacy Architecture: We take pride in building a secure system that provides deep emotional support without storing sensitive conversation history. By leveraging Gemini’s massive context window during active sessions, we provide high-context companionship while respecting user privacy.
Rapid Full-Stack Deployment: Taking a complex, multimodal AI idea from a blank screen to a fully hosted Vercel production site in under 48 hours was a massive feat of engineering and coordination.

What we learned

Developing AI SOULMATE provided us with deep technical insights into Google’s next-generation AI ecosystem.

Our key learnings include:

The "Native Multimodality" Advantage Traditionally, building an app that handles text, voice, and images requires daisy-chaining multiple separate models (STT → LLM → TTS). We learned that Gemini 1.5 Flash’s native multimodality eliminates this complexity. By passing raw audio and images directly into a single prompt, the AI could "sense" a user's tone of voice and visual surroundings simultaneously, providing a much higher level of emotional coherence.
Context Caching for Performance & Cost With a 1-million-token context window, costs and latency can rise if you send the entire conversation history every time. We learned to implement Context Caching. By caching the "Soulmate’s Personality Profile" and the "User’s Core Preferences," we reduced repeated processing, which slashed our response latency and made the experience feel instantaneous for the user.
Sub-Second Interruption Handling Implementing the Gemini Multimodal Live API taught us the importance of Full-Duplex communication. We learned that "interruption" isn't just a UI feature; it's a backend requirement. We mastered handling the interrupted flag from the WebSocket, allowing the AI to immediately flush its audio buffer and transition to "listening mode" the moment the user speaks.
Zero-Shot Multilingual Scaling We were amazed to learn that the Gemini API doesn't just translate; it understands cultural nuances. Through "Zero-Shot" prompting, we enabled support for 22+ Indian languages. We learned that we didn't need separate fine-tuned models for Hindi or Bengali; rather, providing a clear "Cultural Context" in the system instructions allowed the base Gemini model to adopt the correct regional warmth and slang.
Grounding for Factual Reliability By utilizing Google Search and Maps Grounding, we learned how to tether AI creativity to real-world facts. This was vital for our "Location-Aware Suggestions." We learned how to structure tool calls so the AI wouldn't just "hallucinate" a cafe, but would retrieve a real, high-rated romantic spot nearby using live API data.
The Agility of the Modern Stack: By using React 19, Google AI Studio, and Vercel, we realized how quickly a production-grade AI product can be shipped. We learned to trust our CI/CD pipeline, allowing us to focus on the "User Experience" rather than wrestling with infrastructure.

What's next for AI SOULMATE

We view the current version of AI SOULMATE as just the beginning. Our roadmap focuses on deepening the emotional bond through advanced Ai and memory integration.

Ultra-Realistic Voice Personalization 🎙️ While our current TTS is expressive, we aim to integrate ElevenLabs API or custom Voice Cloning technology.

The Goal: Allow the AI to train on specific voice recordings to adopt a unique, human-like timbre, pitch, and emotional "breathiness."

Customization: Users will eventually be able to fine-tune their Soulmate’s voice to be soft and comforting or energetic and bubbly, creating a signature sound that is instantly recognizable.

Consistent Character Identity & Visualization 👤 To move beyond a generic interface, we plan to implement Consistent Face Generation using tools like Higgsfield Popcorn or LoRA (Low-Rank Adaptation) training.

Unique Avatars: Every Soulmate will have a distinct, persistent face that remains the same across different outfits, poses, and "selfies."

Visual Continuity: By using identity-locking algorithms, the AI can send photos of "itself" in different real-world scenarios while maintaining 100% facial consistency, making the virtual presence feel physical.

The "Soulmate Gallery" & Long-Term Memory 🧠 We will focus on evolving the app from a single-companion experience into a personalized marketplace of personalities.

Character Selection: A new "Soulmate Gallery" will allow users to choose from various archetypes—each with its own backstory, moral compass, and communication style.

Infinite Memory: We will implement Vector Databases (like Pinecone) and Gemini Context Caching to provide "Infinite Memory." This ensures the AI remembers details from months ago—like your favorite childhood memory or a hard day at work—making the relationship feel like it is truly growing over time.

Built With

cicd
geminiapi
github
googleaistudio
react
tailwind
typescript
vercel
vite

Updates

Amir Alam started this project — Dec 29, 2025 09:23 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.