Inspiration

Modern enterprises face a dual crisis: a relentless surge in multi-stage phishing attacks and zero-day repository exploits, alongside skyrocketing cloud bills driven by expensive, GPU-dependent AI security tools. Security teams want the analytical power of Large Language Models (LLMs) to scan raw email headers and Git commits, but corporate privacy policies strictly forbid sending sensitive proprietary code or internal communications to third-party commercial APIs.We built GuardLens to break this deadlock. It is a fully sovereign, localized triage pipeline designed to run high-throughput security classification entirely on cost-effective, CPU-only Arm64 cloud nodes [Track 2 description].

What it Does

GuardLens acts as an automated, localized gatekeeper for corporate digital streams. It features a stunning, Vercel v0-inspired developer analytics dashboard linked to an asynchronous Python FastAPI backend pipeline [Track 2 description].Ingestion: Security engineers or automated cron-jobs stream raw, un-redacted data (such as corporate email server logs or active developer Git commits) directly to a local endpoint [Track 2 description].Localized Triage: GuardLens evaluates the payload using a heavily optimized Microsoft Phi-3-mini (3.8B parameter) Small Language Model running locally on the server's CPU [Track 2 description].Deterministic Classification: Instead of generating slow, conversational filler text, the underlying model is bound by rigid system-level constraints to instantly return structured data strings parsing the payload's exact Threat Level (Low, Medium, High), Vector Category (Phishing, Exploit, Clean), and a concise Security Flag Reason [Track 2 description].Live Telemetry: The system logs execution speeds down to the millisecond, calculating live latency metrics to visually prove compute efficiency directly on the interface [Track 2 description].

How We Built It

GuardLens was built utilizing a lean, high-utility technical stack engineered to maximize Arm CPU-bound execution paths [Track 2 description]:The Backend Architecture: Driven by FastAPI and Uvicorn to handle ultra-low-latency asynchronous requests [Track 2 description].The Compute Engine: Powered by Hugging Face Transformers and PyTorch, configured explicitly for CPU execution (device_map="cpu") [Track 2 description]. The backend maps the model using torch.float32 structures, ready to take advantage of Armv9 SVE2 vector extensions or KleidiAI software dispatches for hardware acceleration [Track 2 description].The Frontend Interface: Built as a single-file, highly reactive dashboard using utility-first Tailwind CSS. It communicates natively with the local API using asynchronous vanilla JavaScript fetch structures, changing the UI state in real-time based on incoming risk variables (e.g., flashing a Rose-Glow terminal banner for high-threat exploits).

Challenges

We OvercameEliminating AI Hallucinations: Standard LLMs tend to chat. To ensure the output could be ingested reliably by downstream security infrastructure, we engineered a deterministic system prompt that forces the model to conform strictly to a predefined string template, parsing outputs effortlessly using standard string-splitting operations.Maximizing CPU Speed: Running heavy deep-learning inference on standard server processors without a GPU can easily cause latency spikes. We solved this by pairing a lightweight, highly capable 3.8B parameter model with precise float mapping, ensuring swift inference loops that execute in fractions of a second on modern cloud infrastructure [Track 2 description].

Accomplishments That We're Proud OfWe successfully built an independent, production-ready offline Python pipeline (/guardlens) that completely eliminates third-party API dependencies [Track 2 description]. We also designed an interactive full-stack live simulation layer using a secure background server, allowing judges to immediately interact with and benchmark our security payload modules using custom testing seeds with zero setup friction.

What We Learned

We proved that you do not need expensive, energy-hungry cloud GPUs to run reliable, enterprise-grade automated security agents. Quantized Small Language Models (SLMs) running on optimized, multi-threaded Arm CPU instances can comfortably handle high-throughput text and code evaluation loops with incredible accuracy [Track 2 description].

What's Next for GuardLens

CI/CD Integration: Packaging GuardLens into a GitHub Action that automatically blocks commits containing leaked API keys or zero-day vulnerabilities before they hit production.Fine-Tuning: Continually training the underlying SLM on the latest phishing vectors and active CVE exploit databases to ensure higher specialized accuracy.Advanced Quantization: Deploying via llama.cpp using 4-bit integer quantization (INT4) to drive infrastructure hosting costs down even further [Track 2 description].

Built With

Share this project:

Updates