🌾 MandiBhav — WhatsApp Mandi Price Bot

Inspiration

Agriculture remains the lifeblood of rural India, employing close to half of the national workforce. Yet millions of smallholder farmers suffer from acute information asymmetry. Middlemen and brokers (arhtiyas) often exploit the lack of real-time market data to buy crops below fair market rates.

While the Indian Government maintains portals like Agmarknet.gov.in, these websites are:

  • Mobile-unfriendly and difficult to navigate on low-bandwidth rural connections.
  • Monolingual or poorly translated.
  • Static, requiring active searching rather than proactive alerts.

Farmers do not want to download heavy, memory-consuming mobile applications or navigate complex web dashboards. They want information immediately, in their local language, through an app they already use every single day: WhatsApp.

MandiBhav was born to bridge this gap. By combining conversational AI with real-time mandi scraping, MandiBhav delivers daily prices and agricultural advice directly to a farmer's WhatsApp chat: free, fast, and in 10 languages.


What it does

MandiBhav is an AI-powered conversational bot on WhatsApp that acts as a personal agricultural agent for farmers:

  1. Seamless Multilingual Onboarding: On the first message, the bot lets farmers select their preferred language from 10 options: nine Indian regional languages (Hindi, Bengali, Telugu, Marathi, Tamil, Gujarati, Kannada, Malayalam, and Punjabi) plus English. The bot remembers this choice for all future interactions.
  2. Real-time Price Inquiries: Farmers can ask natural questions in their own style (e.g., "Pyaaz ka bhav Indore mandi me kya chal raha hai?" or "Show me current potato prices in Bihar"). The AI extracts the commodity, market, and state, queries our backend database, and replies with clean, easy-to-read price listings (minimum, maximum, and modal price).
  3. Automated Daily Alerts (Subscriptions): Farmers can subscribe to get daily price notifications for their primary crops in their local market.
  4. Source-Grounded Farming Consultation: When asked general agricultural questions (e.g., "How do I control leaf curl disease in tomatoes?"), MandiBhav triggers a multi-step web search (DuckDuckGo plus scraping of the top articles), synthesizes the text into a simple answer, and provides a citation link to reduce the risk of LLM hallucinations.
  5. Zero-Error UX: If a server or scraping job fails behind the scenes, the user is never exposed to raw code or technical errors. Instead, they receive a warm, polite fallback in their native language (e.g., "We are checking with the mandi, please give us a moment...").
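
The Zero-Error UX behaviour in item 5 could be sketched roughly as follows. This is a minimal illustration, not our production code: the names safe_reply and FALLBACKS, the language codes, and the exact fallback wording are assumptions made for the example.

```python
import logging

logger = logging.getLogger("mandibhav")

# Hypothetical fallback strings keyed by language code (illustrative wording only).
FALLBACKS = {
    "en": "We are checking with the mandi, please give us a moment...",
    "hi": "हम मंडी से जानकारी ले रहे हैं, कृपया थोड़ा इंतज़ार करें...",
}

def safe_reply(handler, text, lang="en"):
    """Run handler(text); on any failure return a polite localized fallback."""
    try:
        return handler(text)
    except Exception:
        # Keep the full traceback in the server logs; never forward it to the user.
        logger.exception("message handler failed")
        # Unknown languages fall back to English in this sketch.
        return FALLBACKS.get(lang, FALLBACKS["en"])
```

Wrapping every message handler this way means a scraper timeout or database error surfaces only in the logs, while the farmer sees a friendly message.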

How I built it

MandiBhav is built with a decoupled, high-performance service architecture:

  • WhatsApp Bridge (Node.js): Utilizes the Baileys library (v7) to interface directly with the WhatsApp Web WebSocket protocol. This allows running the bridge without high enterprise API costs. It exposes a simple Express web server to receive and send messages back to WhatsApp.
  • Backend API (FastAPI): A Python FastAPI application forms the core backend. It handles user session states, validates incoming webhooks, and serves REST endpoints for user subscription, commodity search, rates, and scraping pipelines.
  • AI Orchestration (NVIDIA NIM & Llama): Response generation is powered by LLMs hosted on NVIDIA NIM (specifically instruction-tuned models such as Llama-3/Llama-4 variants). The system prompt and parameters are fully externalized into a prompts.json configuration file, allowing developers to change the AI's guidelines or swap models with hot-reloading (no restarts required).
  • AI Tool Calling: The backend dynamically parses the model's output for special call patterns like (call GET /api/rates?commodity=...). FastAPI intercepts these calls, runs the local SQL/web queries, inserts the results back into the conversation context, and lets the model output a friendly summary.
  • Web Scraper (Playwright & Firefox): A custom headless browser automation engine that goes to Agmarknet.gov.in, interacts with complex dynamic dropdowns, updates the items-per-page to retrieve all rows, and parses the HTML table rows.
  • Task Queue (Celery & Redis): Scrapes are long-running and resource-heavy. We offloaded these tasks to a Celery worker queue backed by Redis to keep the main FastAPI web server completely responsive.
  • Hybrid Storage (SQLite & Redis): User session data and historical rates are stored in a SQLite database configured in WAL (Write-Ahead Logging) mode using SQLAlchemy ORM to ensure concurrent read/write safety. Hot conversation context is cached in Redis for rapid retrieval.
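
The AI Tool Calling step above can be illustrated with a small parser. This is a sketch under stated assumptions: the exact call syntax, the regular expression, and the function name extract_tool_calls are hypothetical, modelled only on the (call GET /api/rates?commodity=...) pattern mentioned in the list.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed marker shape: "(call GET /api/rates?commodity=onion&market=Indore)"
TOOL_CALL = re.compile(r"\(call\s+(GET|POST)\s+(/\S+?)\)")

def extract_tool_calls(model_output):
    """Return a list of (method, path, params) tuples found in the model output."""
    calls = []
    for method, target in TOOL_CALL.findall(model_output):
        parts = urlsplit(target)
        # parse_qs returns lists of values; keep the first value for each key.
        params = {key: values[0] for key, values in parse_qs(parts.query).items()}
        calls.append((method, parts.path, params))
    return calls
```

The backend would then run each extracted call against its own endpoints and append the result to the conversation before asking the model for the final, farmer-friendly summary.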

Challenges I ran into

  1. Dynamic Dropdowns on Government Portals: Agmarknet uses custom Peer-JS styled dropdowns instead of native HTML <select> elements. Playwright could not select values using standard methods. We solved this by implementing simulated keyboard events, typing values character-by-character, waiting for lazy-loaded floating panels, and pressing Escape to close widgets.
  2. Steering the LLM Away from Technical Jargon: In early testing, the AI model would explain its tools (e.g., "I am querying the database" or "I am calling the scraper API"). Farmers found this confusing. We overcame this by writing highly strict behavioral prompts and adding a post-processing regex filter on the backend that strips out all technical execution strings before transmitting the message to WhatsApp.
  3. Handling Concurrency in SQLite: Multiple incoming WhatsApp messages writing chat logs while the scraper wrote thousands of rows resulted in "database is locked" errors. We solved this by enabling WAL mode, introducing scoped SQLAlchemy sessions, and offloading the scrapers to a separate Celery process.
  4. Baileys Session Dropouts: Unofficial WhatsApp bridges are prone to random disconnections. We implemented a self-healing connection loop in server.js that catches disconnect events and automatically restarts the socket with exponential backoff.
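
The post-processing filter from challenge 2 might look something like the sketch below. The pattern list is an assumption built from the two example phrases quoted above plus leftover tool-call markers; the real filter likely covers more cases, and strip_technical is a hypothetical name.

```python
import re

# Illustrative patterns only; a production filter would need a longer list.
TECH_PATTERNS = [
    # Leftover tool-call markers such as "(call GET /api/rates?commodity=onion)".
    re.compile(r"\(call\s+[A-Z]+\s+/\S*?\)"),
    # Whole sentences that narrate tool usage.
    re.compile(
        r"[^.!?\n]*\b(?:querying the database|calling the \w+ API)\b[^.!?\n]*[.!?]?",
        re.IGNORECASE,
    ),
]

def strip_technical(text):
    """Remove technical execution strings before the reply goes to WhatsApp."""
    for pattern in TECH_PATTERNS:
        text = pattern.sub("", text)
    # Collapse the gaps left behind by removed fragments.
    return re.sub(r"[ \t]{2,}", " ", text).strip()
```

A filter like this is a safety net behind the behavioral prompt, not a replacement for it: it only catches phrasings that have been seen before.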

Accomplishments that I'm proud of

  • Zero-Friction UX: There is no barrier to entry. Farmers do not need an email, a password, or a new app. The bridge's QR code is scanned once on our server side; farmers simply message the bot and they are online.
  • Hot-Reloadable AI: The ability to modify prompts, guidelines, and model temperatures on a live production server via prompts.json without dropping a single websocket connection.
  • Robust Fallback Engine: Building a system where database timeouts, API failures, or scraping blockages are elegantly handled and explained to the farmer in an encouraging, localized tone rather than presenting a generic crash message.
  • Localized Warmth: Getting the LLM to successfully adopt a respectful, warm tone (such as using greetings like "Namaste Kisan Bhai/Behen" in Hindi) that makes the farmer feel respected and listened to.
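
One simple way to get the hot-reload behaviour described above is to re-read the file whenever its modification time changes. The sketch below assumes that approach; the class name PromptConfig is hypothetical, and the actual mechanism in MandiBhav may differ (for example, a file watcher).

```python
import json
import os

class PromptConfig:
    """Reload a JSON config file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._data = {}

    def get(self):
        # A cheap stat call on each request; the file is re-parsed only on change.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as f:
                self._data = json.load(f)
            self._mtime = mtime
        return self._data
```

Calling get() at the start of each request picks up edits to prompts.json without restarting the server. A production version would probably also keep the previous config if the file is caught mid-write and fails to parse.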

What I learned

  • Designing for Rural Accessibility: Tech adoption in rural communities thrives on minimal UI. WhatsApp is the de facto operating system for rural India; building inside it is far more effective than launching a separate app.
  • LLMs as Autonomous Interface Routers: We learned how powerful LLMs can be when treated as intent routers. With nothing more than API paths written into the system prompt, the model can chain search queries, scrapers, and local DB requests.
  • The Importance of WAL Mode: For lightweight projects, PostgreSQL is often overkill. Understanding how to optimize SQLite with Write-Ahead Logging and proper connection pooling saved us days of infrastructure overhead.
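
For readers unfamiliar with WAL mode, here is a minimal sketch of enabling it. It uses Python's standard sqlite3 module rather than the SQLAlchemy setup we actually run, and the function name and the 5-second busy timeout are assumptions chosen for illustration.

```python
import sqlite3

def open_wal_connection(db_path, busy_timeout_ms=5000):
    """Open a SQLite connection with Write-Ahead Logging enabled."""
    conn = sqlite3.connect(db_path)
    # WAL lets readers keep working while a single writer appends to the log.
    # The pragma returns the journal mode that is now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    # Wait for a competing writer instead of failing at once with "database is locked".
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)};")
    return conn, mode
```

WAL mode is stored in the database file itself, so it only has to be set once, while the busy timeout applies per connection. WAL still allows only one writer at a time, which is why moving the bulk scraper writes into a separate process also mattered.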

What's next for MandiBhav

  • Voice Note Integration: Many farmers prefer sending voice notes over typing. We plan to integrate NVIDIA Riva or the Whisper API to transcribe incoming audio messages, letting farmers ask questions hands-free.
  • Computer Vision for Disease Diagnosis: Allow farmers to upload photos of infected leaves. The bot will pass the image to a vision model (e.g., LLaVA or Llama 3.2 Vision) to diagnose the pest or disease and suggest treatments.
  • Predictive Price Alerts: Train lightweight models on our accumulated rate datasets to offer 7-day price forecasts and trend recommendations (e.g., "Garlic prices are expected to rise by 10% next week; consider holding your stock").
  • Weather-Triggered Broadcasts: Pair location data with micro-local weather feeds to send alerts before critical weather events (frost, heavy rains, heatwaves) so farmers can protect their crops in time.
