Inspiration
Language is not just grammar; it is behavior, hierarchy, politeness, and emotion.
Apps like Duolingo teach vocabulary effectively, but they cannot teach how to:
Refuse indirectly in Japan
Bargain energetically in India
Negotiate assertively in New York
We were inspired by one core observation:
People don't fail in language because they lack words.
They fail because they lack context.
Samvad XR was born from the idea of building a "flight simulator for culture": a place where learners can safely make mistakes, offend an AI vendor, recover, adapt, and build real-world confidence before they ever step off the plane.
What it does
Samvad XR is a culturally aware immersive language training platform.
Instead of flashcards, users enter a VR scenario (our MVP: Negotiating with a Street Vendor) and:
Speak naturally in the target language
Receive culturally contextual responses
Experience emotional consequences in real time
Core Features:
Context Over Content: Users practice full scenarios, not isolated phrases.
Cultural RAG Engine: We use a Retrieval-Augmented Generation pipeline that dynamically injects real etiquette rules into the LLM prompt:
Japanese → indirect politeness
Tamil Nadu → dynamic bargaining
Western business → direct negotiation
State-Aware Interaction (GraphDB): We track conversation states like:
OPENING → INQUIRY → HAGGLING → DEAL / ANGER
The AI vendor maintains a Happiness Score that reacts to:
Your price offers
Your tone
Cultural correctness
This makes every interaction adaptive and emotionally grounded.
Indic Speech Integration: We integrated Sarvam AI to support high-quality Hindi, Tamil, and Kannada speech interaction.
The result: When users travel, they don’t just translate words; they communicate with confidence.
How we built it
Our architecture combines AI, graph modeling, and speech systems:
- VR Frontend
Immersive street market environment
Real-time speech input/output
Emotionally responsive AI avatar
- Cultural RAG Backend
Vector database storing etiquette rules
Scenario-specific retrieval
Dynamic prompt injection into Claude
- Neo4j GraphDB (The Big Pivot)
We moved from stateless responses to a persistent emotional model using Neo4j.
We model:
Conversation states
Emotional transitions
Vendor Happiness Score
This allowed the AI to:
Hold grudges
Warm up gradually
Escalate if insulted
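As a rough illustration of how a persistent score can produce grudges and gradual warm-up, here is a minimal in-memory sketch. It is not our Neo4j schema: the `VendorMood` class, the 0.0 to 1.0 range, the `grudge` field, and every numeric weight are assumptions chosen for the example. The three inputs (price offer, tone, cultural correctness) are the ones the Happiness Score reacts to.

```python
from dataclasses import dataclass


@dataclass
class VendorMood:
    """Hypothetical in-memory stand-in for the vendor state kept in the graph."""

    happiness: float = 0.5  # assumed range: 0.0 (furious) to 1.0 (delighted)
    grudge: float = 0.0     # accumulated insult that dampens later recovery

    def update(self, offer_ratio: float, polite: bool, etiquette_ok: bool) -> float:
        """Apply one user turn and return the new happiness.

        offer_ratio is the user's offer divided by the asking price.
        All weights below are illustrative assumptions.
        """
        delta = 0.0
        # Price signal: an offer under half the asking price counts as an insult.
        if offer_ratio < 0.5:
            delta -= 0.30
            self.grudge += 0.2
        elif offer_ratio >= 0.8:
            delta += 0.10
        # Tone and cultural-correctness signals.
        delta += 0.05 if polite else -0.15
        delta += 0.05 if etiquette_ok else -0.10
        # Grudge: positive changes shrink as past insults accumulate,
        # so the vendor warms up more slowly after being offended.
        if delta > 0:
            delta *= max(0.0, 1.0 - self.grudge)
        self.happiness = min(1.0, max(0.0, self.happiness + delta))
        return self.happiness


if __name__ == "__main__":
    mood = VendorMood()
    print(mood.update(offer_ratio=0.3, polite=False, etiquette_ok=False))
    print(mood.update(offer_ratio=0.9, polite=True, etiquette_ok=True))
```

Because the state lives on the object rather than in the prompt, a rude lowball in turn one still shapes the response in turn five, which is the behavior a stateless chatbot cannot reproduce.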
- Speech Layer
STT + TTS optimized for Indic languages
Support for accents and code-mixing (Hinglish)
Low-latency conversational loop
This combination turned a chatbot into a stateful cultural agent.
Challenges we ran into
- From Stateless Chatbot to Stateful Agent
Initially, our AI vendor forgot everything between turns. If you lowballed aggressively, the vendor responded politely again.
This broke immersion.
Solution: We implemented a graph-based emotional memory system in Neo4j:
Tracked states like OPENING → HAGGLING → ANGER
Maintained a persistent Happiness Score
Now the AI remembers your behavior and adapts.
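The state tracking described here can be sketched as a small state machine driven by the Happiness Score. This is a plain-Python approximation, not the graph queries we run against Neo4j: the `next_state` function, the two thresholds, and the rule that the vendor can return from ANGER to HAGGLING once the score recovers are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    OPENING = "OPENING"
    INQUIRY = "INQUIRY"
    HAGGLING = "HAGGLING"
    DEAL = "DEAL"
    ANGER = "ANGER"


ANGER_BELOW = 0.2  # assumed threshold: below this the vendor gets angry
DEAL_ABOVE = 0.8   # assumed threshold: above this the vendor accepts


def next_state(current: State, happiness: float) -> State:
    """Return the state after one turn, given the current happiness score."""
    if current is State.OPENING:
        return State.INQUIRY
    if current is State.INQUIRY:
        return State.HAGGLING
    if current is State.HAGGLING:
        if happiness < ANGER_BELOW:
            return State.ANGER
        if happiness > DEAL_ABOVE:
            return State.DEAL
        return State.HAGGLING
    # Assumption: an angry vendor resumes haggling once the score recovers.
    if current is State.ANGER and happiness >= ANGER_BELOW:
        return State.HAGGLING
    # DEAL is terminal; ANGER persists while the score stays low.
    return current


if __name__ == "__main__":
    state = State.OPENING
    for score in (0.5, 0.5, 0.1, 0.1, 0.3, 0.9):
        state = next_state(state, score)
        print(score, state.value)
```

In a graph database the states would be nodes and the transitions relationships, so the same rules could be stored as data instead of `if` branches.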
- The “Culture Hallucination” Problem
The LLM often:
Reverted to generic assistant tone
Hallucinated stereotypes
Ignored specific etiquette nuances
Hardcoding prompts for 60+ languages was impractical.
Solution: We built a Cultural RAG pipeline:
Query etiquette vectors based on language + scenario
Inject cultural constraints dynamically
Keep context windows efficient
This ensured grounded, culturally accurate responses.
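The retrieve-then-inject flow can be sketched roughly as follows. This is a toy stand-in, not our production pipeline: the `EtiquetteRule` type, the `retrieve` and `build_system_prompt` functions, the language and scenario codes, and the word-overlap ranking (used here in place of real vector similarity) are all assumptions, and the rule texts are illustrative placeholders rather than entries from our database.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EtiquetteRule:
    language: str
    scenario: str
    text: str


# Illustrative placeholder rules only; not the project's actual rule set.
RULES = [
    EtiquetteRule("ja", "street_vendor", "Decline indirectly and avoid a flat no."),
    EtiquetteRule("ta", "street_vendor", "Expect lively bargaining over the asking amount."),
    EtiquetteRule("ta", "street_vendor", "Greet the vendor before discussing price."),
    EtiquetteRule("en", "business_meeting", "State your position directly."),
]


def _tokens(text: str) -> set[str]:
    """Lowercased word set; a crude substitute for an embedding."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(language: str, scenario: str, utterance: str, k: int = 2) -> list[EtiquetteRule]:
    """Filter by language + scenario, then rank by word overlap with the utterance."""
    candidates = [r for r in RULES if r.language == language and r.scenario == scenario]
    query = _tokens(utterance)
    ranked = sorted(candidates, key=lambda r: len(query & _tokens(r.text)), reverse=True)
    return ranked[:k]  # capping k keeps the context window small


def build_system_prompt(language: str, scenario: str, utterance: str, k: int = 2) -> str:
    """Inject only the retrieved rules into the system prompt."""
    lines = [
        f"You are role-playing the '{scenario}' scenario. Reply in language '{language}'.",
        "Follow these etiquette rules:",
    ]
    lines += [f"- {rule.text}" for rule in retrieve(language, scenario, utterance, k)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt("ta", "street_vendor", "I want to try bargaining"))
```

Adding a language then means adding rules to the store rather than writing another hand-tuned prompt.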
- Indic Language Tooling Gaps
Global providers struggled with:
Accents
Code-mixed speech
Regional dialect nuance
Solution: Integrated Sarvam AI for:
High-fidelity STT
Natural-sounding TTS
Authentic Indian language support
This made our MVP feel truly local, not translated.
Accomplishments that we're proud of
Successfully transformed a stateless chatbot into a stateful emotional agent
Built a scalable Cultural RAG system instead of hardcoded prompts
Created a dynamic Happiness Score model influencing behavior
Delivered authentic Indic speech integration
Designed a system that works for both consumers and corporate training
Most importantly:
We built something that feels human.
What we learned
Culture is harder than language.
Memory is essential for immersion.
Emotional state modeling dramatically increases realism.
RAG scales better than hardcoded prompts.
Local language optimization is critical for global products.
We also learned that true immersion requires emotional consequence; without it, learning feels fake.
What's next for SamvadXR
- Multi-Scenario Expansion
Business meeting in Berlin
Temple visit in Tamil Nadu
Formal Japanese office interaction
Airport emergency conversations
- Corporate Dashboard
Analytics on cultural adaptation
Emotional response heatmaps
Employee readiness scoring
- Advanced Emotion Modeling
Multi-agent simulation
Group negotiations
Non-verbal gesture recognition
- AI Mentor Mode
Post-scenario breakdown:
Where you violated etiquette
Alternative phrasing suggestions
Cultural explanation modules
- Scaling to 60+ Languages
Using our Cultural RAG backbone as a scalable foundation.