Project Report: SamarthVaani Empowering Accessibility through Multilingual RAG

  1. Executive Summary SamarthVaani is a specialized AI-driven platform designed to bridge the information gap for India's Persons with Disabilities (PwD) community. While the government offers numerous welfare schemes, navigating complex official websites, language barriers, and documentation requirements often prevents the intended beneficiaries from accessing them.

This project implements a Retrieval-Augmented Generation (RAG) pipeline that translates legal and governmental jargon into simple, actionable guidance in 11 Indian languages, featuring a voice-first interface for maximum inclusivity.

  1. The Problem Statement Despite the existence of the RPwD (Rights of Persons with Disabilities) Act and various state-level benefits, two major hurdles persist:

Information Asymmetry: Scheme details are often buried in lengthy PDFs on fragmented government portals.

Language & Literacy Barriers: Most official documentation is in English or formal Hindi, which may not be accessible to rural populations or those with visual impairments.

Complex Onboarding: Moving from "knowing" about a scheme to "applying" for it involves a daunting paperwork process.

  1. The Solution: What is Built We built a full-stack, voice-enabled intelligence layer that sits between the user and the government data.

Key Features: Multilingual Voice Interface: Users can ask questions in their mother tongue (e.g., Tamil, Marathi, Bengali) and receive spoken responses.

Factual RAG Engine: Instead of relying on general LLM knowledge, the system only answers based on a verified "Gold Standard" dataset of 1,414 government document chunks.

Hybrid Search Mechanism: Combines semantic understanding (BGE embeddings) with keyword matching (BM25) to find specific scheme names and eligibility keywords accurately.

Automated Application Generator: A tool that converts a simple conversation into a structured application letter for UDID cards, pensions, or employment schemes.

  1. Technical Architecture The system is divided into three specialized layers:

A. Data & Vector Layer (The Brain) Data Sourcing: Web-scraped and manually verified data from the RPwD portals.

Storage: Hosted on Databricks, where data is chunked and indexed.

Vector Search: Utilizes the samarthvaani_vs_endpoint for sub-second retrieval of policy information.

B. Backend API (The Orchestrator) Framework: Built with FastAPI for high-speed asynchronous processing.

LLM: Powering the logic is Llama-3.3-70B, fine-tuned for instruction following and factual accuracy.

Voice/Translation: Integration with Sarvam AI's specialized models:

Saarika for Speech-to-Text.

IndicTrans2 for high-fidelity translation across 11 languages.

Bulbul for natural, human-like Text-to-Speech.

C. Frontend (The Experience) Stack: React + Vite + TypeScript.

Design: Tailwind CSS-based UI designed for high contrast and ease of navigation, ensuring the interface itself is accessible.

  1. Why This Approach? Accuracy over Creativity: By using RAG with a temperature of 0.1, we ensure the AI doesn't "hallucinate" fake schemes. It only speaks the truth found in official documents.

Inclusivity by Design: By prioritizing Voice Input/Output, we cater to users with visual impairments or those who struggle with typing.

End-to-End Utility: We don't just provide information; we provide the Application Generator, moving the user from a state of inquiry to a state of action.

  1. Conclusion SamarthVaani is more than just a chatbot; it is a digital advocate. By leveraging state-of-the-art LLMs and localized AI models, it ensures that government welfare is not just a policy on paper, but a reachable reality for every citizen, regardless of their language or physical ability

Built With

  • database
  • databricks
  • fastapi
  • react
  • sarwam
  • sdk
  • tailwinder
Share this project:

Updates