Inspiration
India’s compliance landscape is fragmented. Regulatory amendments are scattered across many websites, with no centralized API or database. Compliance teams spend hours monitoring portals like FSSAI, RBI, SEBI, BIS, APEDA, and the e-Gazette, downloading PDFs, reading dense legal text, and manually mapping clauses to company operations. This is slow, error-prone, and expensive.
Vigilo was created to solve this problem at scale using AI-driven automation.
What It Does
Vigilo automatically scrapes, processes, and interprets regulatory amendments, then maps them directly to a company’s context. Here’s how it works:
Continuous Regulatory Scraping: Scrapers fetch new amendments from FSSAI, RBI, SEBI, BIS, APEDA, and the e-Gazette. Each PDF is parsed, cleaned, chunked, and enriched with metadata (sector, subsector, keywords).
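A minimal sketch of the chunk-and-enrich step described above, using only the Python standard library. The `Chunk` fields, the window sizes, and the vocabulary-based keyword tagging are illustrative assumptions, not Vigilo's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One retrievable piece of an amendment plus its metadata (hypothetical schema)."""
    text: str
    authority: str
    sector: str
    subsector: str
    keywords: list[str] = field(default_factory=list)


def chunk_text(text: str, max_words: int = 120, overlap: int = 20) -> list[str]:
    """Split cleaned text into overlapping word windows."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def enrich(text: str, authority: str, sector: str, subsector: str,
           vocabulary: list[str]) -> list[Chunk]:
    """Chunk the text and tag each chunk with the vocabulary terms it contains."""
    enriched = []
    for piece in chunk_text(text):
        lowered = piece.lower()
        found = [term for term in vocabulary if term in lowered]
        enriched.append(Chunk(piece, authority, sector, subsector, found))
    return enriched
```

The overlap keeps a clause that straddles a window boundary visible in both neighbouring chunks, which usually helps retrieval at the cost of some duplicated text.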
Vectorized Knowledge Base: All chunks are encoded into dense embeddings using transformer models and stored in ChromaDB for ultra-fast, context-aware retrieval.
Granular Company Profiles: When a company registers, it uploads documents—labels, SOPs, policies, and supply-chain data—and selects its sector and subsector. Vigilo builds a structured profile with keywords and descriptors.
Hybrid Retrieval Layer: A combined system of metadata filters, keyword ranking, and cosine similarity ensures only the most relevant amendments surface (e.g., a coffee company sees coffee labeling updates, not poultry standards).
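The hybrid layer could be sketched roughly as follows: a hard metadata filter first, then a blended score of cosine similarity and keyword overlap. The `Amendment` type, the `alpha` weighting, and the overlap measure are assumptions made for illustration; the embeddings are taken as already computed.

```python
import math
from dataclasses import dataclass


@dataclass
class Amendment:
    """Hypothetical record for one indexed amendment chunk."""
    title: str
    sector: str
    subsector: str
    keywords: set[str]
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(profile_keywords: set[str], amendment_keywords: set[str]) -> float:
    """Fraction of the company's keywords that also appear on the amendment."""
    if not profile_keywords:
        return 0.0
    return len(profile_keywords & amendment_keywords) / len(profile_keywords)


def hybrid_search(amendments, sector, subsector, profile_keywords,
                  profile_embedding, alpha=0.7, top_k=5):
    """Return up to top_k (score, amendment) pairs, best first."""
    # Stage 1: metadata filter removes other sectors and subsectors outright.
    candidates = [a for a in amendments
                  if a.sector == sector and a.subsector == subsector]
    # Stage 2: blend semantic similarity with keyword overlap.
    scored = [
        (alpha * cosine(profile_embedding, a.embedding)
         + (1 - alpha) * keyword_overlap(profile_keywords, a.keywords), a)
        for a in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Filtering before scoring is what keeps a poultry standard from ever reaching a coffee company, even if its text happens to be semantically close.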
Multi-Stage Open-Source Model Pipeline:
- MODEL_ANALYSIS_A (llama3-70b-8192) + MODEL_ANALYSIS_B (openai/gpt-oss-20b) extract clauses and deadlines and classify domains.
- MODEL_DETAILS (openai/gpt-oss-120b) performs deep relevance matching between amendments and company documents.
- MODEL_COMPLIANCE (openai/gpt-oss-20b) conducts evidence-rich compliance checks and gap analysis.
- MODEL_OPTIMIZE (deepseek-r1-distill-llama-70b) aggregates findings, ranks urgency, and sets actionable deadlines.
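One way to picture the staged pipeline is as a chain of functions that each read an accumulated context and add their own keys. In the sketch below every stage is a plain-Python stand-in (simple string matching), not a model call; the function names, context keys, and matching rules are all hypothetical.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def run_pipeline(stages: list[tuple[str, Stage]], context: dict) -> dict:
    """Run stages in order, merging each stage's output into the context."""
    for name, stage in stages:
        result = stage(context)
        trace = context.get("trace", []) + [name]
        context = {**context, **result, "trace": trace}
    return context


def analyse(ctx: dict) -> dict:
    # Stand-in for MODEL_ANALYSIS_A/B: split the amendment into clauses.
    return {"clauses": [s.strip() for s in ctx["amendment"].split(".") if s.strip()]}


def match_details(ctx: dict) -> dict:
    # Stand-in for MODEL_DETAILS: keep clauses that mention a profile keyword.
    keywords = ctx["profile_keywords"]
    return {"relevant": [c for c in ctx["clauses"]
                         if any(k in c.lower() for k in keywords)]}


def check_compliance(ctx: dict) -> dict:
    # Stand-in for MODEL_COMPLIANCE: a clause is a gap if no document contains it.
    docs = " ".join(ctx["documents"]).lower()
    return {"gaps": [c for c in ctx["relevant"] if c.lower() not in docs]}


def optimise(ctx: dict) -> dict:
    # Stand-in for MODEL_OPTIMIZE: aggregate findings into a report.
    return {"report": {"gap_count": len(ctx["gaps"]), "gaps": ctx["gaps"]}}


STAGES = [
    ("analysis", analyse),
    ("details", match_details),
    ("compliance", check_compliance),
    ("optimize", optimise),
]
```

Keeping a `trace` of completed stages makes it easier to see where a run stopped if one stage fails or returns nothing useful.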
Evidence-Backed Reports: Vigilo goes beyond keyword search to true semantic understanding—summarizing amendments, aligning them to internal policies, detecting risks, and producing actionable, evidence-rich compliance reports.
Real-Time Alerts & Dashboard: Insights are delivered through a FastAPI backend, an interactive React dashboard, and instant Gmail notifications to ensure compliance teams never miss an update.
By combining regulatory scraping, dense vector embeddings, hybrid retrieval, and a multi-model AI pipeline, Vigilo moves compliance from reactive manual tracking to proactive, automated intelligence at scale.
How We Built It — Vigilo
Vigilo was engineered from the ground up to tackle the monumental challenge of India's fragmented regulatory ecosystem. Our build was guided by a core principle: move beyond simple search to deliver deep, contextual, and automated intelligence.
1. The Data Firehose: Taming Unstructured Chaos
We built a robust, fault-tolerant scraping infrastructure that doesn't just download files — it understands them.
- Automatically categorizes raw text by authority, urgency, and affected industry the moment it's ingested.
- Transforms a flood of unstructured PDFs into a structured, query-ready stream of regulatory intelligence.
2. The Company DNA Profile: Beyond Basic Registration
When a company onboards, Vigilo doesn't just collect data — it builds a dynamic digital twin.
- By analyzing uploaded SOPs, product labels, and internal policies, we construct a living profile of a company's operational reality.
- Creates a unique fingerprint for hyper-personalized compliance matching.
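As a rough illustration of how a keyword profile might be derived from uploaded documents, the sketch below counts frequent terms across the documents. The `CompanyProfile` fields, the stopword list, and the frequency-based extraction are assumptions; the actual profile builder may work quite differently.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are",
             "must", "shall", "from", "all", "our"}


@dataclass
class CompanyProfile:
    """Hypothetical structured profile used for matching."""
    name: str
    sector: str
    subsector: str
    keywords: list[str]


def build_profile(name: str, sector: str, subsector: str,
                  documents: list[str], top_n: int = 10) -> CompanyProfile:
    """Collect the most frequent non-stopword terms across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]{3,}", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    keywords = [term for term, _ in counts.most_common(top_n)]
    return CompanyProfile(name, sector, subsector, keywords)
```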
3. The Reasoning Engine: From Retrieval to Understanding
The core of Vigilo isn't retrieval; it's reasoning. We architected a cascading AI pipeline where each model specializes in a critical cognitive task:
- Legal Analyst Model: Deconstructs complex amendments into their core actionable components.
- Forensic Auditor Model: Cross-references requirements line-by-line against the company's internal documentation to pinpoint exact gaps.
- Chief Compliance Officer Model: Synthesizes findings to prioritize risks, assign clear ownership, and generate a step-by-step action plan with hard deadlines.
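The final prioritization step can be thought of as an ordering over findings. The sketch below sorts by days remaining until the deadline and breaks ties by severity; the `Finding` type and the 1-to-3 severity scale are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Finding:
    """Hypothetical compliance gap produced by the earlier stages."""
    title: str
    severity: int  # 1 (low) to 3 (high); this scale is an assumption
    deadline: date


def rank_by_urgency(findings: list[Finding], today: date) -> list[Finding]:
    """Overdue and near-deadline findings first; ties go to higher severity."""
    return sorted(findings,
                  key=lambda f: ((f.deadline - today).days, -f.severity))
```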
4. The Tech Backbone: Engineered for Scale and Security
We chose a stack for enterprise-grade reliability and speed:
- FastAPI Backend — High-performance, low-latency processing even under heavy load.
- React Dashboard — Interactive, real-time compliance view for officers.
- Real-time Gmail Integration — Immutable audit trail ensuring critical updates never get missed.
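A minimal sketch of an alert email, assuming delivery over Gmail's SMTP endpoint with an app password (the write-up does not say whether SMTP or the Gmail API is used). The function names and the subject format are hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender: str, recipient: str, amendment_title: str,
                authority: str, deadline: str, summary: str) -> EmailMessage:
    """Compose a plain-text alert for one amendment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[Vigilo] {authority}: {amendment_title}"
    msg.set_content(f"{summary}\n\nAction deadline: {deadline}\n")
    return msg


def send_alert(msg: EmailMessage, username: str, app_password: str) -> None:
    """Send over Gmail SMTP with SSL; credentials are supplied by the caller."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, app_password)
        server.send_message(msg)
```

Separating message construction from sending keeps the alert content testable without network access or credentials.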
Result
This architecture allows Vigilo to do what human teams cannot:
Process the entire regulatory universe in real time and deliver only the precise, actionable insights that matter.
Challenges We Ran Into
Inconsistent PDF Formats & Multilingual Texts
Regulatory amendments appeared in varied PDF layouts and languages, complicating automated text extraction and normalization.
Smart Filtering for Relevance
Avoiding irrelevant recommendations (e.g., chicken labeling rules for a coffee company) required advanced domain classification and contextual filtering.
Scaling Vector Search
Achieving fast and accurate semantic search across thousands of amendment chunks required optimizing storage and performance with FAISS.
Data Security & Privacy
Sensitive company data had to stay inside the system, so we built an architecture around on-premise models and secure vector databases.
Accomplishments We’re Proud Of
- Built a working prototype that converts scattered, unstructured legal text into actionable compliance insights.
- Implemented a Retrieval-Augmented Generation (RAG) pipeline tailored to India’s regulatory ecosystem.
- Designed real-time Gmail alerts for new regulatory amendments.
- Created an extensible architecture supporting industries beyond food and fintech.
What We Learned
- India’s regulatory landscape is extremely fragmented, creating major pain for compliance teams.
- How retrieval architectures fit together: vector databases, embeddings, and hybrid search.
- How to chain prompts across multiple agents to simulate human-like compliance reasoning workflows.
- Why metadata tagging and re-ranking matter for retrieval precision.
- Best practices for secure data handling in enterprise AI solutions.
What’s Next for Vigilo
- Deploy fully on-premises open-source LLMs using frameworks like vLLM or Ollama to ensure data never leaves company servers.
- Expand coverage to additional sectors such as healthcare, insurance, and manufacturing.
- Add predictive analytics for forecasting regulatory changes and risk scores.
- Launch a SaaS model with tiered pricing and API-as-a-Service.