Inspiration

As organizations face increasing scrutiny around data privacy, policy adherence, and security compliance, it becomes overwhelming to manually audit and interpret internal documents. Most GenAI solutions either rely on cloud APIs (creating privacy risks) or lack the depth to parse enterprise-specific content. Compliance AI was born from a simple question: "What if there was a private, offline AI assistant that could help any team answer compliance-related questions—without sending a single byte to the cloud?" That vision shaped our architecture: secure, local, voice-enabled, and explainable.

What it does

Compliance AI is a locally hosted, voice- or text-driven AI assistant that:

  • Ingests internal PDFs and Word documents (such as security policies, legal terms, and PII guidelines)
  • Answers user questions based on document content using vector-based retrieval
  • Provides answers with confidence scores and source traceability
  • Runs entirely offline, using Mistral via Ollama for both answers and guardrails
  • Uses Whisper for real-time voice input
  • Optionally uploads interaction logs to AWS S3 for audit/compliance purposes
  • Restricts access via a PIN-based gate and guards against unsafe queries using LLM moderation
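To make the retrieval, confidence score, and source traceability ideas concrete, here is a minimal sketch rather than the project's actual code. It swaps the FAISS index and HuggingFace embeddings for a toy bag-of-words vector and cosine similarity so that it runs with the standard library alone. The names `Chunk`, `embed`, and `retrieve`, the use of cosine similarity as the "confidence score", and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # hypothetical label: file name and page the chunk came from


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[Chunk], top_k: int = 2) -> list[tuple[float, Chunk]]:
    """Return the top_k (score, chunk) pairs, best match first."""
    q = embed(question)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up sample data, only to exercise the functions above.
chunks = [
    Chunk("Passwords must be rotated every 90 days.", "security_policy.pdf, p. 3"),
    Chunk("Customer PII may not be stored on personal laptops.", "pii_guidelines.docx, p. 1"),
    Chunk("Vendors must sign a data processing agreement.", "legal_terms.pdf, p. 7"),
]

results = retrieve("How often must passwords be rotated?", chunks, top_k=2)
for score, chunk in results:
    print(f"{score:.2f}  [{chunk.source}]  {chunk.text}")
```

Because every chunk carries its `source`, the answer can cite where it came from, and the similarity score gives the user a rough sense of how well the passage matched the question.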

How we built it

  • Frontend/UI: Tkinter-based GUI + CLI fallback
  • Voice Input: Integrated sounddevice, wave, and Whisper for live audio capture and transcription
  • Document Parsing: PyPDF2 and python-docx extract content
  • Vector DB: FAISS used with HuggingFace embeddings to store and semantically search document chunks
  • LLM Integration: OllamaLLM used to run Mistral locally for response generation and moderation
  • Retrieval Pipeline: Built with LangChain to stitch all components into a smooth flow
  • Logging: All interactions saved locally and optionally uploaded to a configured S3 bucket
  • Deployment: Packaged for local launch and deployed to an AWS EC2 WorkSpace
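Before document text can be embedded and stored in the vector index, it has to be split into chunks. LangChain provides text splitters for this; the hand-rolled function below is only a sketch of the idea, and the name `chunk_text` and its `chunk_size` and `overlap` parameters are assumptions, not the project's settings.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no tail chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, overlap=3)
print(pieces)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.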

Challenges we ran into

  • Guardrails & Moderation: It was difficult to enforce safety without adding latency or bias. We solved this by using the same model (Mistral) for input and output filtering, balancing performance and security.
  • Whisper Integration: Handling voice input with live transcription while keeping the UI responsive required careful threading.
  • Resource Constraints: Running everything locally meant optimizing for performance. Gemma 2B worked initially, but upgrading to Mistral after provisioning GPU support made a huge difference.
  • S3 Log Uploading: Ensuring logs uploaded only on proper exit while maintaining resilience required careful signal handling.
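The same-model guardrail could look roughly like the sketch below. Here `llm` is any callable that takes a prompt and returns text; in the project that role is played by Mistral running under Ollama, while this example uses a stub so it runs anywhere. The prompt wording, the SAFE/UNSAFE protocol, and the names `is_safe`, `guarded_answer`, and `stub_llm` are assumptions for illustration only.

```python
from typing import Callable

LLM = Callable[[str], str]

MODERATION_PROMPT = (
    "You are a safety filter. Reply with exactly one word, SAFE or UNSAFE.\n"
    "Text to review:\n{text}"
)

REFUSAL = "Sorry, I can't help with that request."


def is_safe(llm: LLM, text: str) -> bool:
    """Ask the model for a one-word verdict on the given text."""
    verdict = llm(MODERATION_PROMPT.format(text=text))
    # Fail closed: anything other than a clear SAFE verdict counts as unsafe.
    return verdict.strip().upper() == "SAFE"


def guarded_answer(llm: LLM, question: str, context: str) -> str:
    """Filter the input, generate an answer, then filter the output."""
    if not is_safe(llm, question):
        return REFUSAL
    answer = llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
    if not is_safe(llm, answer):
        return REFUSAL
    return answer


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for the local model, for demonstration only."""
    if prompt.startswith("You are a safety filter"):
        return "UNSAFE" if "bypass" in prompt.lower() else "SAFE"
    return "Passwords must be rotated every 90 days."


print(guarded_answer(stub_llm, "How often are passwords rotated?", "policy text"))
print(guarded_answer(stub_llm, "How do I bypass the PIN gate?", "policy text"))
```

In this sketch an accepted question costs three model calls (input check, answer, output check), which illustrates why moderation and latency pull against each other.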

Accomplishments that we're proud of

  • Built a fully offline, secure, explainable GenAI system
  • Integrated real-time voice input without third-party APIs
  • Designed our own guardrail mechanism using LLM-based self-moderation
  • Enabled document traceability by citing sources and confidence levels
  • Successfully tied in AWS S3 log compliance with optional upload toggle
  • Went from zero to deployed on an EC2 WorkSpace with GPU support
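The optional upload toggle and the upload-on-exit behavior can be sketched as follows. This is an illustration under assumptions, not the project's code: `InteractionLog`, `install_exit_hooks`, and the JSON Lines log format are made-up names and choices, and the uploader is just a callable so the example runs without AWS credentials.

```python
import atexit
import json
import signal
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


class InteractionLog:
    """Append interactions to a local file; optionally upload it once on close."""

    def __init__(self, path: Path, uploader: Optional[Callable[[Path], None]] = None):
        self.path = path
        self.uploader = uploader  # None means the upload toggle is off
        self._closed = False

    def record(self, question: str, answer: str) -> None:
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"question": question, "answer": answer}) + "\n")

    def close(self) -> None:
        if self._closed:  # make repeated calls harmless
            return
        self._closed = True
        if self.uploader is not None and self.path.exists():
            try:
                self.uploader(self.path)
            except Exception as exc:
                # A failed upload should not lose the local copy or crash shutdown.
                print(f"log upload failed, keeping local copy: {exc}", file=sys.stderr)


def install_exit_hooks(log: InteractionLog) -> None:
    """Run log.close() on normal exit, Ctrl+C, or SIGTERM (call from the main thread)."""
    atexit.register(log.close)

    def _on_signal(signum, frame):
        sys.exit(0)  # raises SystemExit, so the atexit handler above still runs

    signal.signal(signal.SIGINT, _on_signal)
    signal.signal(signal.SIGTERM, _on_signal)


# Demonstration with a fake uploader that only remembers what it was given.
uploaded = []
with tempfile.TemporaryDirectory() as tmp:
    log = InteractionLog(Path(tmp) / "interactions.jsonl", uploader=uploaded.append)
    log.record("Who approves vendor access?", "See the access policy.")
    log.close()
    log.close()  # second call is a no-op

print([p.name for p in uploaded])
```

A real uploader might wrap boto3's S3 client `upload_file(Filename, Bucket, Key)` call; whether the project uses boto3 is an assumption here.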

What we learned

  • How to build secure RAG pipelines using LangChain, FAISS, and local LLMs
  • The power of retrieval + generation when tuned properly (chunk size, top_k, etc.)
  • Challenges of GPU resource management on AWS WorkSpaces
  • Handling mic input and threading properly in Python
  • The real-world complexities of building for enterprise privacy and compliance use cases
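One common way to keep a GUI responsive during recording and transcription is to run the slow work in a background thread and hand the result back through a queue. The sketch below shows that pattern under assumptions: `start_transcription` and `poll` are hypothetical names, and `fake_record_and_transcribe` stands in for the real sounddevice capture and Whisper call.

```python
import queue
import threading
import time
from typing import Callable, Optional, Tuple

Result = Tuple[str, str]  # ("text", transcript) or ("error", message)


def start_transcription(
    record_and_transcribe: Callable[[], str],
    results: "queue.Queue[Result]",
) -> threading.Thread:
    """Run the slow capture + transcription step off the UI thread."""

    def worker() -> None:
        try:
            results.put(("text", record_and_transcribe()))
        except Exception as exc:
            results.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll(results: "queue.Queue[Result]") -> Optional[Result]:
    """Non-blocking check, meant to be called periodically from the UI thread."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None


def fake_record_and_transcribe() -> str:
    """Hypothetical stand-in for microphone capture plus Whisper."""
    time.sleep(0.05)
    return "what is our password policy"


results: "queue.Queue[Result]" = queue.Queue()
thread = start_transcription(fake_record_and_transcribe, results)
thread.join()  # a GUI would poll instead of blocking like this
outcome = poll(results)
print(outcome)
```

In a Tkinter app, `poll` would typically be rescheduled with `root.after(...)` so that widgets are only ever updated from the main thread.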

What's next for Compliance AI

  • Build a web-based dashboard using FastAPI or Streamlit for improved UX
  • Add multi-document comparison (e.g., “What changed between version A and B?”)
  • Detect PII patterns inside ingested docs (regex + LLM hybrid)
  • Support live document updating without restart
  • Add multi-modal support (image-based compliance docs or scanned contracts)
  • Offer an optional Bedrock fallback in hybrid mode for power users

Built With

  • faiss
  • huggingface
  • langchain
  • mistral
  • ollama
  • openai
  • pypdf2
  • python
  • python-docx
  • sounddevice
  • tkinter
  • whisper