Inspiration
As large language models become more powerful, connecting them to personal or enterprise data becomes essential for contextual reasoning. However, sending raw files—including sensitive content—to cloud-based APIs raises significant privacy concerns. We were inspired to bridge this gap by building a system that could preserve privacy without sacrificing the utility of intelligent, cloud-based reasoning.
What it does
LlamaGuard is a secure preprocessing layer built on top of the Model Context Protocol (MCP). It runs locally and performs intelligent summarization, filtering, and formatting of local files before any data is sent to external LLMs. It supports diverse file types (e.g., .pptx, .jpeg, .pdf), detects and redacts sensitive content, and outputs standardized JSON ready for MCP ingestion.
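To make the output concrete, here is a sketch of what one standardized record might look like. All field names here are illustrative assumptions, not the project's actual schema:

```json
{
  "source": "slides/q3_review.pptx",
  "type": "presentation",
  "summary": "Quarterly revenue overview with regional breakdowns.",
  "redactions": ["EMAIL", "PHONE"],
  "token_estimate": 180
}
```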
How we built it
We built a custom MCP server that ingests local file directories, runs each file through a local instance of the LLaMA model via Llama Stack, and processes it through summarization, privacy classification, and formatting stages. We designed a JSON schema that represents file content in a compressed, structured form compatible with downstream LLM systems. Key components include:
Llama Stack for running LLaMA locally
File parsers for diverse formats (.docx, .pptx, .csv, .jpeg, .pdf)
Privacy filtering and summarization modules
A formatting pipeline that outputs structured JSON
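The redaction-and-formatting stages above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the regex patterns, field names, and the placeholder `summarize` function are all assumptions standing in for the model-based classification and Llama Stack summarization described above.

```python
import json
import re

# Hypothetical PII patterns for a fast, regex-based redaction pass.
# The real system combines filtering like this with model-based classification.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text):
    """Replace matched PII spans with typed placeholders like [EMAIL]."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[{label}]", text)
    return text, found

def to_mcp_json(path, raw_text, summarize=lambda t: t[:200]):
    """Build a structured record; `summarize` stands in for the local model."""
    clean, labels = redact(raw_text)
    return json.dumps({
        "source": path,
        "summary": summarize(clean),
        "redactions": labels,
    })
```

Redacting before summarizing means sensitive strings never reach the model prompt at all, which keeps the privacy guarantee independent of model behavior.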
Challenges we ran into
Building a flexible pipeline that could handle diverse file types consistently
Designing a privacy detection system that balanced accuracy with speed
Ensuring summaries retained semantic context while significantly reducing file size
Integrating the LLaMA stack locally without sacrificing runtime performance
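For the local-inference step, a minimal sketch of calling a locally served model looks like the following. It assumes an Ollama server on its default port and a generic summarization prompt; the model name and prompt wording are placeholders, not the project's actual configuration:

```python
import json
import urllib.request

# Ollama's default local endpoint; no data leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_summary_request(model, text):
    """Construct the JSON payload for a local summarization call."""
    return {
        "model": model,
        "prompt": f"Summarize the following document in 3 sentences:\n\n{text}",
        "stream": False,  # request one complete response instead of chunks
    }

def summarize_locally(text, model="llama3"):
    """Send the prompt to the locally running model and return its reply."""
    payload = json.dumps(build_summary_request(model, text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```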
Accomplishments that we're proud of
Successfully preprocessing and summarizing over a gigabyte of mixed-format files
Building a privacy-aware system that detects and redacts sensitive information automatically
Supporting real-world file formats like images and presentations, which typical MCP servers ignore
Creating a scalable and secure JSON output format ready for downstream use
What we learned
Local model inference (especially with LLaMA) is viable and powerful when optimized correctly
Privacy filtering and summarization are essential preprocessing steps in real-world AI systems
Building useful AI infrastructure requires both thoughtful architecture and deep understanding of real-world user data constraints
What's next for LlamaGuard
We plan to extend LlamaGuard to support real-time file monitoring, allowing agents to dynamically ingest new local context. We also aim to incorporate multimodal reasoning capabilities (e.g., better vision-language summarization), tighter integration with agent frameworks, and fine-grained user control over what content gets processed or shared. In the long term, LlamaGuard could become a core layer in privacy-preserving AI systems, both personal and enterprise-grade.
Built With
- docker
- javascript
- llamastack
- ollama
- python
- typescript