Inspiration

Language is not just grammar. It is behavior, hierarchy, politeness, and emotion.

Apps like Duolingo teach vocabulary effectively, but they cannot teach how to:

Refuse indirectly in Japan

Bargain energetically in India

Negotiate assertively in New York

We were inspired by one core observation:

People don't fail in language because they lack words.
They fail because they lack context.

Samvad XR was born from the idea of building a "flight simulator for culture": a place where learners can safely make mistakes, offend an AI vendor, recover, adapt, and build real-world confidence before they ever step off the plane.

What it does

Samvad XR is a culturally aware immersive language training platform.

Instead of flashcards, users enter a VR scenario (our MVP: Negotiating with a Street Vendor) and:

Speak naturally in the target language

Receive culturally contextual responses

Experience emotional consequences in real time

Core Features:

  1. Context Over Content: Users practice full scenarios, not isolated phrases.

  2. Cultural RAG Engine: We use a Retrieval-Augmented Generation pipeline that dynamically injects real etiquette rules into the LLM prompt:

    Japanese → indirect politeness

    Tamil Nadu → dynamic bargaining

    Western business → direct negotiation
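As a minimal sketch of the prompt-injection step, assuming the retrieved rules arrive as plain strings: the function name `build_system_prompt`, the prompt wording, and the sample rule are illustrative assumptions, not the project's actual code.

```python
def build_system_prompt(culture: str, scenario: str, rules: list[str]) -> str:
    """Assemble a system prompt that carries retrieved etiquette rules.

    All names and wording here are hypothetical; the real prompt may differ.
    """
    # One bullet per retrieved rule so the model can follow them individually.
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return (
        f"You are a vendor in this scenario: {scenario}.\n"
        f"Cultural context: {culture}.\n"
        "Follow these etiquette rules when responding:\n"
        f"{rule_lines}"
    )


if __name__ == "__main__":
    prompt = build_system_prompt(
        culture="Japanese",
        scenario="street market negotiation",
        rules=["Refuse indirectly and avoid a blunt 'no'."],  # sample rule, assumed
    )
    print(prompt)
```

Because the rules are passed in rather than hardcoded, the same template can serve any culture for which rules exist.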

  3. State-Aware Interaction (GraphDB): We track conversation states like:

OPENING → INQUIRY → HAGGLING → DEAL / ANGER

The AI vendor maintains a Happiness Score that reacts to:

Your price offers

Your tone

Cultural correctness

This makes every interaction adaptive and emotionally grounded.
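One way the Happiness Score could react to those three signals is sketched below. The 0 to 100 range, the weights, and the `Turn` fields are all assumptions chosen for illustration; the write-up does not specify the actual scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    offer_ratio: float        # learner's offer divided by the asking price
    tone: float               # -1.0 (rude) to 1.0 (warm); assumed to come from a tone classifier
    culturally_correct: bool  # whether the utterance respected the retrieved etiquette rules


def update_happiness(score: float, turn: Turn) -> float:
    """Return the new score after one turn, clamped to 0..100 (weights are hypothetical)."""
    delta = 0.0
    if turn.offer_ratio < 0.5:
        delta -= 20.0          # aggressive lowball
    elif turn.offer_ratio >= 0.8:
        delta += 10.0          # fair offer
    delta += 10.0 * turn.tone  # tone shifts the score in either direction
    delta += 5.0 if turn.culturally_correct else -10.0
    return max(0.0, min(100.0, score + delta))
```

For example, a rude lowball that also breaks etiquette moves a score of 50 down to 10 under these assumed weights.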

  4. Indic Speech Integration: We integrated Sarvam AI to support high-quality Hindi, Tamil, and Kannada speech interaction.

The result: when users travel, they don’t just translate words, they communicate with confidence.

How we built it

Our architecture combines AI, graph modeling, and speech systems:

  1. VR Frontend

Immersive street market environment

Real-time speech input/output

Emotionally responsive AI avatar

  2. Cultural RAG Backend

Vector database storing etiquette rules

Scenario-specific retrieval

Dynamic prompt injection into Claude
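Scenario-specific retrieval can be illustrated with a toy example. This sketch swaps the real vector database and embedding model for a bag-of-words vector and cosine similarity so that it runs without dependencies; the rule texts, the `scenario` field, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical etiquette entries; a real store would hold embeddings in a vector database.
RULES = [
    {"scenario": "street_market", "text": "Bargain on price with energy and humor"},
    {"scenario": "street_market", "text": "Greet the vendor before asking the price"},
    {"scenario": "office", "text": "Refuse indirectly and avoid a blunt no"},
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, scenario: str, k: int = 2) -> list[str]:
    """Filter rules by scenario, then rank them by similarity to the query."""
    candidates = [rule for rule in RULES if rule["scenario"] == scenario]
    query_vec = embed(query)
    ranked = sorted(candidates, key=lambda r: cosine(query_vec, embed(r["text"])), reverse=True)
    return [rule["text"] for rule in ranked[:k]]
```

Filtering by scenario before ranking keeps rules from unrelated settings out of the prompt even when their wording happens to match the query.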

  3. Neo4j GraphDB (The Big Pivot)

We moved from stateless responses to a persistent emotional model using Neo4j.

We model:

Conversation states

Emotional transitions

Vendor Happiness Score

This allowed the AI to:

Hold grudges

Warm up gradually

Escalate if insulted
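The three behaviors above can be sketched with an in-memory stand-in for the graph. This is not the Neo4j schema; the allowed transitions (including a recovery edge from ANGER back to HAGGLING), the 0.5 warm-up factor, and the recovery threshold of 60 are assumptions made for illustration.

```python
# Allowed state transitions, mirroring edges that could be stored in a graph database.
TRANSITIONS = {
    "OPENING": {"INQUIRY"},
    "INQUIRY": {"HAGGLING"},
    "HAGGLING": {"DEAL", "ANGER"},
    "ANGER": {"HAGGLING"},  # assumed recovery path
    "DEAL": set(),
}


class VendorMemory:
    """Persistent emotional state for one conversation (hypothetical model)."""

    def __init__(self) -> None:
        self.state = "OPENING"
        self.happiness = 50.0

    def move_to(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"Cannot move from {self.state} to {new_state}")
        self.state = new_state

    def react(self, delta: float, insulted: bool = False) -> None:
        # Grudges: negative changes apply in full, positive ones only at half strength.
        self.happiness += delta * 0.5 if delta > 0 else delta
        self.happiness = max(0.0, min(100.0, self.happiness))
        if insulted and self.state == "HAGGLING":
            self.move_to("ANGER")     # escalate if insulted
        elif self.state == "ANGER" and self.happiness >= 60.0:
            self.move_to("HAGGLING")  # warm up only after sustained good behavior
```

Since the object outlives a single turn, one insult keeps shaping later replies until the learner earns the score back.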

  4. Speech Layer

STT + TTS optimized for Indic languages

Support for accent + code-mixing (Hinglish)

Low-latency conversational loop
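A single pass through the conversational loop might look like the sketch below. The three stages are passed in as plain callables because the write-up does not show the Sarvam AI or Claude calls; `run_turn` and its parameters are hypothetical names, and the per-stage timing is included only to show where a latency budget could be measured.

```python
import time
from typing import Callable


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # speech-to-text stage (stand-in)
    respond: Callable[[str], str],       # LLM stage (stand-in)
    synthesize: Callable[[str], bytes],  # text-to-speech stage (stand-in)
) -> tuple[bytes, dict[str, float]]:
    """Run STT, LLM, and TTS in sequence and record how long each stage took."""
    timings: dict[str, float] = {}

    start = time.perf_counter()
    text = transcribe(audio)
    timings["stt"] = time.perf_counter() - start

    start = time.perf_counter()
    reply = respond(text)
    timings["llm"] = time.perf_counter() - start

    start = time.perf_counter()
    speech = synthesize(reply)
    timings["tts"] = time.perf_counter() - start

    return speech, timings
```

Keeping the stages injectable also makes it easy to test the loop with stubs before wiring in real speech services.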

This combination turned a chatbot into a stateful cultural agent.

Challenges we ran into

  1. From Stateless Chatbot to Stateful Agent

Initially, our AI vendor forgot everything between turns. If you lowballed aggressively, the vendor responded politely again.

This broke immersion.

Solution: We implemented a graph-based emotional memory system in Neo4j:

Tracked states like OPENING → HAGGLING → ANGER

Maintained a persistent Happiness Score

Now the AI remembers your behavior and adapts.

  2. The “Culture Hallucination” Problem

The LLM often:

Reverted to generic assistant tone

Hallucinated stereotypes

Ignored specific etiquette nuances

Hardcoding prompts for 60+ languages was impossible.

Solution: We built a Cultural RAG pipeline:

Query etiquette vectors based on language + scenario

Inject cultural constraints dynamically

Keep context windows efficient

This kept responses grounded in the retrieved etiquette rules rather than the model's generic defaults.
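Keeping the context window efficient can be as simple as capping how much retrieved text is injected. The sketch below assumes the rules arrive already ranked by relevance and uses word count as a rough proxy for tokens; `fit_to_budget` and the budget value are hypothetical.

```python
def fit_to_budget(ranked_rules: list[str], max_words: int) -> list[str]:
    """Keep the highest-ranked rules that fit within a word budget.

    Word count is only an approximation of token count (an assumption of this sketch).
    """
    selected: list[str] = []
    used = 0
    for rule in ranked_rules:
        cost = len(rule.split())
        if used + cost > max_words:
            break  # stop at the first rule that would exceed the budget
        selected.append(rule)
        used += cost
    return selected
```

Stopping at the first rule that does not fit preserves the relevance ordering, at the cost of sometimes leaving a little of the budget unused.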

  3. Indic Language Tooling Gaps

Global providers struggled with:

Accents

Code-mixed speech

Regional dialect nuance

Solution: Integrated Sarvam AI for:

High-fidelity STT

Natural-sounding TTS

Authentic Indian language support

This made our MVP feel truly local, not translated.

Accomplishments that we're proud of

Successfully transformed a stateless chatbot into a stateful emotional agent

Built a scalable Cultural RAG system instead of hardcoded prompts

Created a dynamic Happiness Score model influencing behavior

Delivered authentic Indic speech integration

Designed a system that works for both consumers and corporate training

Most importantly:

We built something that feels human.

What we learned

Culture is harder than language.

Memory is essential for immersion.

Emotional state modeling dramatically increases realism.

RAG is more scalable than prompt engineering.

Local language optimization is critical for global products.

We also learned that true immersion requires emotional consequence: without it, learning feels fake.

What's next for SamvadXR

  1. Multi-Scenario Expansion

Business meeting in Berlin

Temple visit in Tamil Nadu

Formal Japanese office interaction

Airport emergency conversations

  2. Corporate Dashboard

Analytics on cultural adaptation

Emotional response heatmaps

Employee readiness scoring

  3. Advanced Emotion Modeling

Multi-agent simulation

Group negotiations

Non-verbal gesture recognition

  4. AI Mentor Mode

Post-scenario breakdown:

Where you violated etiquette

Alternative phrasing suggestions

Cultural explanation modules

  5. Scaling to 60+ Languages

Using our Cultural RAG backbone as a scalable foundation.
